Abstract

We study the problem of routing in doubling metrics, and show how to perform hierarchical routing in such metrics with small stretch and compact routing tables (i.e., with a small amount of routing information stored at each vertex). We say that a metric $(X,d)$ has doubling dimension $\dim(X)$ at most $\alpha$ if every set of diameter $D$ can be covered by $2^{\alpha}$ sets of diameter $D/2$. (A doubling metric is one whose doubling dimension $\dim(X)$ is a constant.) For a connected graph $G$ whose shortest-path distances $d_G$ induce the doubling metric $(X,d_G)$, we show how to perform $(1+\tau)$-stretch routing on $G$ for any $0 < \tau < 1$ with routing tables of size at most $(\alpha/\tau)^{O(\alpha)} \log\Delta \log\delta$ bits and only $(\alpha/\tau)^{O(\alpha)} \log\Delta$ entries, where $\Delta$ is the diameter of $G$ and $\delta$ is the maximum degree of $G$. Hence the number of routing table entries is just $\tau^{-O(1)} \log\Delta$ for doubling metrics. These results extend and improve on those of Talwar (2004).

1 Introduction

The doubling dimension of a metric space $(X,d)$ is the least value $\alpha$ such that each ball of radius $R$ can be covered by at most $2^{\alpha}$ balls of radius $R/2$ [12]. For any $\alpha \in \mathbb{Z}$, the space $\mathbb{R}^{\alpha}$ under any of the $\ell_p$ norms has doubling dimension $\Theta(\alpha)$, and hence the doubling dimension extends the standard notion of geometric dimension; moreover, it can be seen as a way to parameterize the inherent "complexity" of metrics.

In this paper, we study the problem of designing routing algorithms for networks whose structure is parameterized by the doubling dimension $\dim(X) = \alpha$; we show that one can route along paths with stretch $(1+\tau)$ using small routing tables with only $O((\alpha/\tau)^{O(\alpha)} \log\Delta)$ entries, where $\Delta$ is the diameter of the network $G$. Each entry stores at most $O(\log\delta)$ bits, where $\delta$ is the maximum degree of $G$, and hence for doubling metrics (where $\alpha$ is a constant) and any $\tau < 1$, we have $(1+\tau)$-stretch routing with only $O(\log\Delta \log\delta)$ bits of routing information at each node.

The idea of placing restrictions on the growth rate of networks to bound their "intrinsic complexity" is by no means novel; it has been around for a long time (see, e.g., [16]), and has recently been used in several contexts in the literature on object location in peer-to-peer networks [21, 15, 14]. While these papers used definitions and restrictions that differ slightly from each other, we note that our results hold in those models as well. Our results extend those of Talwar [23], whose routing schemes for metrics with $\dim(X) = \alpha$ require local routing information of $\approx O(\log^{\alpha} \Delta)$ bits. Formally, we have the following main result.

Theorem 1.1. Given any network $G$, whose shortest path distances $d_G$ induce the doubling metric $(X,d_G)$ with $\dim(X) = \alpha$, and any $\tau > 0$, there is a routing scheme on $G$ that achieves $(1+\tau)$-stretch and where each node stores only $(\alpha/\tau)^{O(\alpha)} \log\Delta \log\delta$ bits of routing information, where $\Delta$ is the diameter of $G$ and $\delta$ is the maximum degree of $G$.

The proof of the theorem proceeds along familiar lines; we construct a set of hierarchical decompositions (HDs) of the metric $(X,d)$, where each HD consists of a set of successively finer partitions of $X$ with geometrically decreasing diameters. Each node in $X$ maintains a table containing next hops to a small subset of clusters in these partitions; to route a packet from $s$ to $t$, we use the routing table for $s$ to pick some "small cluster" $C$ in $s$'s table that contains $t$ and send the packet to some node $x$ in $C$; a similar process repeats at node $x \in C$ until the packet reaches $t$. The idea is to create routing tables which ensure that the distance from $x$ to $t$ is much smaller than that from $s$ to $t$, and hence the detour taken in going from $s$ to $t$ is only $\tau\, d(s,t)$. (Details of the routing schemes appear in Sections 4 and 5.)

While this framework is well-known, the standard ways to construct HDs are top-down methods which iteratively refine partitions. These methods create long-range dependencies which require us to build $O(\log n)$ HDs in general; in order to use the locality of the doubling metrics and get away with $O(\alpha)$ HDs, we develop a bottom-up approach that avoids these dependencies when building HDs. The analysis of this process uses the Lovász Local Lemma (much as in [17, 12]); details are given in Section 3.

1.1 Related Work

Distributed packet routing protocols have been widely studied in the theoretical computer science community; see, e.g., [8, 9, 2, 19, 6, 20], or the survey by Gavoille [10] on some of the issues and techniques. Note that these results, however, are usually for general networks, or for networks with some topological structure. By placing restrictions on the doubling dimension, we are able to give results which degrade gracefully as the "complexity" of the metric increases. For example, it is known that any universal routing algorithm with stretch less than 3 requires some node to store at least $\Omega(n)$ routing information [11]; however, these graphs generate metrics with large $\dim(X)$. Our results thus allow one to circumvent these lower bounds for metrics of "lower dimension".

Packet routing in low dimensional networks has been previously studied by Talwar [23], who gives algorithms that require $O(\alpha (c/\tau)^{\alpha} \log^{2+\alpha} \Delta)$ bits of information (for a constant $c$) to be stored per node in order to achieve $(1+\tau)$-stretch routing, for constant stretch $\tau$ and doubling dimension $\alpha$. The resulting dependence of $O(\log^{2+\alpha} \Delta)$ should be contrasted with the dependence of $O(\log\Delta \log\delta)$ bits of information in our schemes. We should point out that his algorithms are based on graph decomposition ideas with a top-down approach and do not require the LLL to construct routing tables.

One of the papers that influenced this work is that of Kleinrock and Kamoun [16]. They describe a general hierarchical clustering model on which our routing schemes are based. They show that routing schemes based on a hierarchical clustering model do not cause much increase in the average path length for networks that satisfy the following two assumptions: (a) the diameter of any cluster $S$ chosen is bounded above by $O(|S|^{\nu})$ for some constant $\nu \in [0,1]$, and (b) the average distance between nodes in the network is $\Theta(n^{\nu})$. In contrast, we give bounds on the path stretch on a per node-pair level using slightly different assumptions on the network geometry.

Other papers on object location in peer-to-peer networks [21, 15, 14] have also used restrictions similar to [16] on the growth rate of metrics; in particular, they consider metrics where increasing the radius of any ball by a factor of 2 causes the number of points in it to increase by at most some constant factor $2^{\beta}$. (Plaxton et al. [21] also consider the lower bound on the growth.) Here the parameter $\beta$ can be considered to be another notion of "dimension" for a metric space. It can be shown that $\dim(X) \le 4\beta$ [12, Prop. 1.2]; hence our results hold for such metrics as well. Our scheme is also similar in spirit to a data-tracking scheme of Rajaraman et al. [22], who use approximations by tree distributions to obtain bounds on the stretch incurred.


2  Definitions  and  Notation 

Let the input metric be $(X,d)$; this paper deals with finite metrics with at least 2 points. We use standard terminology from the theory of metric spaces; many definitions can be found in [7] and [13]. Given $x \in X$ and $r > 0$, we let $B(x,r)$ denote $\{x' \in X \mid d(x,x') < r\}$, i.e., the ball of radius $r$ around $x$. Given a subset $S \subseteq X$, the distance of $x \in X$ to the set $S$ is $d(x,S) = \min\{d(x,x') \mid x' \in S\}$.

The doubling constant $\lambda_X$ of a metric space $(X,d)$ is the smallest value $\lambda$ such that every ball in $X$ can be covered by $\lambda$ balls of half the radius. The doubling dimension of $X$ is then defined as $\dim(X) = \log_2 \lambda_X$; we use the letter $\alpha$ to denote $\dim(X)$. A metric is called doubling when its doubling dimension is a constant. A subset $Y \subseteq X$ is an $r$-net of $X$ if (1) for every pair of distinct points $x,y \in Y$, $d(x,y) \ge r$, and (2) $X \subseteq \bigcup_{y \in Y} B(y,r)$. Such nets always exist for any $r > 0$, and can be found using a greedy algorithm.
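The greedy construction can be sketched in a few lines of Python. This is only an illustration: the function name `greedy_net`, the representation of the metric as a list of points plus a distance callable, and the choice of a strict inequality for ball membership are assumptions made for this example and are not taken from the paper.

```python
def greedy_net(points, dist, r):
    """Greedily pick a subset Y of `points` whose members are pairwise at
    distance >= r. Every skipped point is closer than r to some member of Y,
    so the balls of radius r around Y cover all points."""
    net = []
    for p in points:
        # Keep p only if it is far from every net point chosen so far.
        if all(dist(p, y) >= r for y in net):
            net.append(p)
    return net


if __name__ == "__main__":
    line = list(range(10))  # the points 0..9 on a line
    print(greedy_net(line, lambda a, b: abs(a - b), 3))
```

The scan visits each point once and compares it against the current net, so it takes $O(|X| \cdot |Y|)$ distance evaluations; the resulting net depends on the order in which points are visited.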

Proposition 2.1. (see, e.g., [12]) If all pairwise distances in a set $Y \subseteq X$ are at least $r$ (e.g., when $Y$ is an $r$-net of $X$), then for any point $x \in X$ and radius $t$, we have that $|B(x,t) \cap Y| \le$

A cluster $C$ in the metric $(X,d)$ is just a subset of points of the set $X$. The diameter of the cluster $C$ is the largest distance between points of the cluster. Each cluster is associated with a center $x \in X$ (which may not lie in $C$), and the radius of the cluster $C$ is the smallest value $r$ such that the cluster $C$ is contained in $B(x,r)$.

Definition 2.1. Given $r > 0$, an $r$-ball partition $\Pi$ of $(X,d)$ is a partition of $X$ into clusters $C_1, C_2, \ldots$, with each cluster $C_i$ having a radius at most $r$.


By scaling, let us assume that the smallest inter-point distance in $X$ is exactly 1. Let $\Delta$ denote the diameter of the metric $(X,d)$, and hence $\Delta$ is also the aspect ratio of the metric. Define $\rho = 256\alpha + 1$ and $h = \lceil \log_{\rho} \Delta \rceil$. Let us define $\eta_i = 1 + \rho + \rho^2 + \ldots + \rho^i < \rho^{i+1}/(\rho - 1)$; note that $\eta_i = \rho\,\eta_{i-1} + 1$. Let us fix a $\rho^i/2$-net and denote it with $N_i$ for the metric $(X,d)$, for every $0 \le i \le h +$
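The scale parameters above are easy to check numerically. The following Python sketch computes $\eta_i$ and $h$ and verifies the stated recurrence and upper bound; the helper names `eta` and `num_levels` are hypothetical, and the value $\alpha = 2$ is chosen only for illustration.

```python
def eta(rho, i):
    """Radius bound eta_i = 1 + rho + ... + rho**i for level i."""
    return sum(rho ** j for j in range(i + 1))


def num_levels(rho, diameter):
    """Smallest integer h with rho**h >= diameter, i.e. ceil(log_rho(diameter)),
    computed by repeated multiplication to avoid floating-point rounding."""
    h = 0
    while rho ** h < diameter:
        h += 1
    return h


if __name__ == "__main__":
    alpha = 2                      # illustrative doubling dimension
    rho = 256 * alpha + 1          # rho as defined in the text
    for i in range(1, 5):
        assert eta(rho, i) == rho * eta(rho, i - 1) + 1      # recurrence
        assert eta(rho, i) * (rho - 1) < rho ** (i + 1)      # upper bound
    print(rho, num_levels(rho, 10 ** 6), eta(rho, 2))
```

The bound holds with room to spare because $\eta_i(\rho - 1) = \rho^{i+1} - 1$ exactly.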


2.1 Hierarchical Decompositions (HDs)

We now give a formal definition of a hierarchical decomposition (HD), which is used throughout this paper and is the basic object of our study. As noted below, such a decomposition can be naturally associated with a decomposition tree that is used for our hierarchical routing schemes.

Definition 2.2. A $\rho$-hierarchical decomposition $\Pi$ ($\rho$-HD) of the metric $(X,d)$ is a sequence of partitions $\Pi_0, \ldots, \Pi_h$ with $h = \lceil \log_{\rho} \Delta \rceil$ such that:


1. The partition $\Pi_h$ has one cluster $X$, the entire set.

2. (geometrically decreasing diameters) The partition $\Pi_i$ is an $\eta_i$-ball partition. Since inter-point distances are at least 1, it implies that $\Pi_0 = \{\{x\} : x \in X\}$; in other words, each cluster in $\Pi_0$ is a singleton vertex.

3. (hierarchical) $\Pi_i$ is a refinement of $\Pi_{i+1}$, and each cluster in $\Pi_i$ is contained within some cluster of $\Pi_{i+1}$.


Given such a $\rho$-HD $\Pi = (\Pi_i)_{i=0}^{h}$, the partition $\Pi_i$ is called the level-$i$ partition of $\Pi$ and clusters in $\Pi_i$ are the level-$i$ clusters. Note that these clusters have radius at most $\eta_i$ and hence diameter $\le 2\eta_i$. Furthermore, define the degree $\deg(\Pi)$ to be the maximum number of level-$i$ clusters contained in any level-$(i+1)$ cluster in $\Pi_{i+1}$, for all $0 \le i \le h-1$.


2.1.1 Hierarchical Decompositions and HSTs

A hierarchical decomposition is a laminar family of sets, where given any two sets, they are either disjoint or one contains the other. It is well known that such a family $\mathcal{F}$ of sets over $X$ can be associated with a natural decomposition tree whose vertices are sets in $\mathcal{F}$ and whose leaves are all the smallest sets in the family (which are elements of $X$, in this case). We can use this to associate a so-called hierarchically well-separated tree (also called an HST [3]) $T_{\Pi}$ with a hierarchical decomposition $\Pi$; since each edge in $T_{\Pi}$ connects some $C \in \Pi_i$ and $C' \in \Pi_{i-1}$ with $C' \subseteq C$, we associate a length $\eta_i$ with edge $(C,C')$. Given such a tree $T_{\Pi}$, we can (and indeed do) talk about its level-$i$ clusters with no ambiguity; these are the same level-$i$ clusters in the associated $\Pi_i$. Note that the degree of vertices in this tree $T_{\Pi}$ is bounded by $\deg(\Pi) + 1$.
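To make the correspondence between an HD and its tree concrete, here is a minimal Python sketch that checks the refinement property and records, for each cluster, its parent and the length of the connecting edge. The data layout (a list of partitions, each a list of frozensets) and the names `build_tree` and `degree` are assumptions made for this illustration.

```python
def build_tree(levels, eta):
    """levels[i] is the level-i partition (a list of frozensets) and eta[i]
    is the radius bound of level i. Returns a dict that maps each node
    (i, cluster) below the top level to (parent_node, edge_length)."""
    parent = {}
    for i in range(len(levels) - 1):
        for child in levels[i]:
            # Refinement: exactly one cluster one level up contains `child`.
            ups = [c for c in levels[i + 1] if child <= c]
            if len(ups) != 1:
                raise ValueError(f"level {i} does not refine level {i + 1}")
            # The edge between a level-(i+1) cluster and its level-i child
            # gets length eta_{i+1}, following the convention in the text.
            parent[(i, child)] = ((i + 1, ups[0]), eta[i + 1])
    return parent


def degree(levels):
    """deg(Pi): largest number of level-i clusters inside one level-(i+1) cluster."""
    return max(sum(1 for c in levels[i] if c <= up)
               for i in range(len(levels) - 1)
               for up in levels[i + 1])


if __name__ == "__main__":
    levels = [
        [frozenset({x}) for x in range(4)],
        [frozenset({0, 1}), frozenset({2, 3})],
        [frozenset(range(4))],
    ]
    tree = build_tree(levels, eta=[1, 4, 13])   # eta_i for rho = 3
    print(len(tree), degree(levels))
```

Each tree vertex has at most $\deg(\Pi)$ children and one parent, which is where the bound $\deg(\Pi) + 1$ on the tree degree comes from.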

2.2 Padded Probabilistic Ball-Partitions

Recall that an $r$-ball partition $\Pi$ of $(X,d)$ is a partition of $X$ into a set of clusters $C \subseteq X$, each contained in a ball $B(v,r)$ for some $v \in X$. A ball $B(x,t)$ is cut in the partition $\Pi$ if there is no cluster $C \in \Pi$ such that $B(x,t) \subseteq C$. In general, $B(x,t)$ is cut by a set $S \subseteq X$ if both $S \cap B(x,t)$ and $B(x,t) \setminus S$ are non-empty.
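Both notions of "cut" translate directly into set operations. The Python sketch below is one way to express them; the function names and the use of a strict inequality for ball membership are assumptions of this example.

```python
def ball(points, dist, x, t):
    """All points at distance less than t from x."""
    return frozenset(y for y in points if dist(x, y) < t)


def is_cut_by_partition(partition, b):
    """True if no single cluster of the partition contains the whole ball b."""
    return not any(b <= cluster for cluster in partition)


def is_cut_by_set(s, b):
    """True if the ball b has points both inside and outside the set s."""
    return bool(b & s) and bool(b - s)


if __name__ == "__main__":
    pts = list(range(6))
    d = lambda a, b: abs(a - b)
    partition = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
    print(is_cut_by_partition(partition, ball(pts, d, 1, 1.5)))  # ball {0, 1, 2}
    print(is_cut_by_partition(partition, ball(pts, d, 2, 1.5)))  # ball {1, 2, 3}
```

A ball is cut in a partition exactly when it is cut by the cluster containing its center, since that cluster always meets the ball.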

Let $\mathcal{P}$ be the collection of all possible partitions of $X$, and hence $\Pi \in \mathcal{P}$. Given a partition $\Pi \in \mathcal{P}$ and $x \in X$, let $C_{\Pi}(x)$ be the cluster of $\Pi$ containing $x$.

Definition 2.3. ([12]) An $(r,\varepsilon)$-padded probabilistic ball-partition of a metric $(X,d)$ is a probability distribution $\mu$ over $\mathcal{P}$ satisfying:

1. (bounded radius) Each $\Pi$ in the support of $\mu$ is an $r$-ball partition.

2. (padding) $\forall x \in X$, $\Pr_{\mu}[d(x, X \setminus C_{\Pi}(x)) \ge \varepsilon r] \ge 1/2$.

(This is called a padded probabilistic decomposition in [12].) Each cluster $C$ in every partition $\Pi$ in the support of a probabilistic ball-partition $\mu$ has radius at most $r$; and for any $x \in X$, a random $r$-ball partition $\Pi$ drawn from the distribution $\mu$ does not cut $B(x,\varepsilon r)$ (and hence $B(x,\varepsilon r)$ is contained in cluster $C_{\Pi}(x) \in \Pi$) with probability $\ge 1/2$.
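When the distribution is uniform over a finite list of partitions, the padding condition can be evaluated directly. The following Python sketch does this; the helper names `padding` and `padded_fraction` are hypothetical, and the code does not check the bounded-radius condition.

```python
def padding(points, dist, partition, x):
    """Distance from x to the nearest point outside the cluster containing x
    (infinity if that cluster is all of X)."""
    cluster = next(c for c in partition if x in c)
    return min((dist(x, y) for y in points if y not in cluster),
               default=float("inf"))


def padded_fraction(points, dist, partitions, x, eps, r):
    """Fraction of the given partitions in which x is padded at scale eps * r.
    Under the uniform distribution over `partitions`, this is the probability
    that appears in the padding condition."""
    good = sum(1 for p in partitions
               if padding(points, dist, p, x) >= eps * r)
    return good / len(partitions)


if __name__ == "__main__":
    pts = list(range(6))
    d = lambda a, b: abs(a - b)
    p1 = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
    p2 = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
    print(padded_fraction(pts, d, [p1, p2], x=1, eps=0.5, r=4))
```

A list of partitions is a padded probabilistic ball-partition under the uniform distribution when this fraction is at least $1/2$ for every point $x$ and every partition in the list is an $r$-ball partition.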

3  Padded  Probabilistic  Hierarchical  Decompositions 

In this section, we define a $(\rho,\varepsilon)$-padded probabilistic hierarchical decomposition (PPHD) of the metric $(X,d)$, on which the routing algorithm is based. A PPHD is a probability distribution over HDs that has a "probabilistic padding" property similar to that in Definition 2.3. For any pair of nodes $s,t$ in $X$ and any ball containing both $s$ and $t$ with a diameter of $d(s,t)$, the PPHD ensures that this ball is contained in a single cluster of radius only slightly (by roughly a factor of $\alpha$) larger than $d(s,t)$ at a suitable level with probability $\ge 1/2$. Thus the shortest $s$-$t$ path is contained entirely in this cluster of radius not much more than $d(s,t)$. This is the general intuition for PPHDs and the starting point for the routing algorithm.

For our applications, we refine PPHDs so that they consist of only $m = O(\alpha \log \alpha)$ HDs. We first give an existence proof in Section 3.1, using the Lovász Local Lemma (LLL), to show that such decompositions exist. We then outline in Section 3.2 a randomized polynomial-time algorithm to find the decompositions using Beck's techniques [4].

The existence proof for the PPHDs has the following outline. We first give a randomized algorithm to form a single random hierarchical decomposition $\Pi$, which proves the existence of PPHDs, albeit with support over an exponential number of HDs. To reduce the size to something that depends only on $\alpha$, we have to use the locality property of the metric space and the LLL. One significant complication in the proof is that we cannot use the standard top-down decomposition schemes to construct PPHDs, since they have long-range correlations that preclude the application of the LLL. Our solution to this problem is to build the decomposition trees in a bottom-up fashion and to make sure that the coarser partitions respect the cluster boundaries made in the finer partitions.

3.1 Existence of PPHDs

Motivated by the routing application, we are interested in finding the following structure, which we call a $(\rho,\varepsilon)$-padded probabilistic hierarchical decomposition. This is a probability distribution $\mu$ over $\rho$-hierarchical decompositions (as defined in Definition 2.2) so that given $B(x,\varepsilon r)$ with $r \approx \rho^i$, if we choose a random $\rho$-HD $\Pi$ from $\mu$ and examine the partition $\Pi_i$ in it, $B(x,\varepsilon r)$ is cut in this partition $\Pi_i$ with probability at most $1/2$.

Definition 3.1. (PPHD) A $(\rho,\varepsilon)$-padded probabilistic hierarchical decomposition (referred to as a $(\rho,\varepsilon)$-PPHD) is a distribution $\mu$ over $\rho$-hierarchical decompositions, such that for any point $x \in X$ and any value $r$ s.t. $\rho^{i-1} < r < \rho^{i}$,

$$\Pr_{\Pi \in \mu}[B(x,\varepsilon r) \text{ is cut in } \Pi_i] \le \frac{1}{2},$$

where the random $\rho$-hierarchical decomposition chosen is $\Pi = (\Pi_i)_{i=0}^{h}$. The degree of the PPHD $\mu$ is defined to be $\deg(\mu) = \max_{\Pi \in \mu} \deg(\Pi)$.

Note that the definition of a PPHD extends both the idea of a padded probabilistic ball-partition and that of HDs: we ask for a distribution over entire HDs, instead of over ball-partitions at a certain scale $r$. However, having picked a random $\rho$-HD $\Pi = (\Pi_i)_{i=0}^{h}$ from this distribution, we demand that balls of radius $\approx \varepsilon\rho^i$ be cut with small probability only in the partition $\Pi_i$ that is "at the correct distance scale". Our main theorem of this section is the following:

Theorem 3.1. Given a metric $(X,d)$, there exists a $(\rho,\varepsilon)$-PPHD $\mu$ for $(X,d)$ with $\rho = O(\alpha)$ and $\varepsilon = O(1/\alpha)$. The degree $\deg(\mu)$ of the PPHD is at most $\alpha^{O(\alpha)}$. Furthermore, there exists a distribution $\hat{\mu}$ whose support is over only $m = O(\alpha \log \alpha)$ HDs.

Since any hierarchical decomposition $\Pi$ can be associated with a tree $T_{\Pi}$ (as mentioned in Section 2.1), the above theorem can be viewed as guaranteeing a set of $m$ trees such that the level-$i$ clusters in half of these trees do not cut a given ball of radius $\approx \varepsilon\rho^i$. This proves the existence of an appropriate tree cover.

Definition 3.2. A stretch-$k$ Steiner tree cover for $(X,d)$ is a set of trees $\mathcal{F} = \{T_1, T_2, \ldots\}$ (with each tree $T_i$ possibly containing Steiner points $\notin X$, and edges having lengths), where for every $x,x' \in X$, there exists a tree $T_j \in \mathcal{F}$ such that the (unique shortest) path in $T_j$ between $x$ and $x'$ has length at most $k\,d(x,x')$.

Lemma 3.1. Given a metric $(X,d)$ with $\dim(X) = \alpha$, there exists a stretch-$O(\rho/\varepsilon)$ Steiner tree cover consisting of $O(\alpha \log \alpha)$ trees, where each tree has degree at most $\alpha^{O(\alpha)}$.

We omit the simple proof of the above lemma and the description of how the Steiner points can be removed from the trees without altering distances and degrees. We prove Theorem 3.1 in the rest of this section. We first prove (in Section 3.1.1) that one can obtain the result where the PPHD $\mu$ has support over many HDs. We then use the Lovász Local Lemma (in Section 3.1.2) to show that a PPHD distribution with support over only a small number of HDs exists.

3.1.1 Padded Probabilistic Hierarchical Partitions

If we do not care about the number of HDs in the support of a PPHD, the existence result of Theorem 3.1 has been proved earlier [23] with better guarantees; the proof basically follows from the padded decompositions given in [12]. However, we now give another proof that introduces ideas that are ultimately useful in obtaining a PPHD distribution whose support is over a small number of HDs.

Theorem 3.2. Given a metric $(X,d)$, there exists a $(\rho,\varepsilon)$-PPHD $\mu$ for $(X,d)$ with $\rho = O(\alpha)$ and $\varepsilon = O(1/\alpha)$, and with degree $\deg(\mu) = \alpha^{O(\alpha)}$. Furthermore, one can sample from $\mu$ in polynomial time.

Proof. We define a randomized process that builds a random hierarchical decomposition tree in a bottom-up fashion, instead of the usual top-down way. To build an HD $\Pi$, we start with $\Pi_0 = \{\{x\} : x \in X\}$ and perform an inductive step. At any step, we are given a partial structure $(\Pi_i, \ldots, \Pi_0)$ where for each $j \le i$, the clusters in $\Pi_{j-1}$ (which is an $\eta_{j-1}$-ball partition) are contained within the clusters of $\Pi_j$. We then build a new partition $\Pi_{i+1}$, with all clusters of $\Pi_i$ being contained within clusters of $\Pi_{i+1}$. We have to ensure that clusters of $\Pi_{i+1}$ are contained in balls of radius at most $\eta_{i+1}$ and that any ball of radius $\varepsilon r$ for $\rho^i < r < \rho^{i+1}$ is cut in $\Pi_{i+1}$ with probability at most $1/2$. This way, we end up with a valid random HD $\Pi$. The claimed probability distribution $\mu$ is the one naturally generated by this algorithm. To create the clusters of $\Pi_{i+1}$ we use a decomposition procedure whose property is summarized in the following lemma.


0. Let $Y \leftarrow X$, $p \leftarrow c\alpha\Gamma/\Lambda$ for a constant $c$ to be fixed later, and $N$ be a $\Lambda/2$-net of $X$.
1. Pick an arbitrary "root" vertex $v \in N$ not picked before
2. Set the initial value of the "radius" $L \leftarrow \Lambda/2$
3. Flip a coin with bias $p$
4. If the coin comes up heads, goto Step 11
5. If the coin comes up tails, increment $L$ by $\Gamma$
6. If $L > \Lambda(1 - 1/4\alpha)$
7.     choose a value $\hat{L}$ from $[0, \Lambda/(4\alpha)]$ u.a.r.
8.     round down $\hat{L}$ to the nearest multiple of $\Gamma$
9.     set $L \leftarrow \Lambda(1 - 1/4\alpha) + \hat{L}$
10. Else goto Step 3
11. Form a new cluster $C'$ in $\Pi''$ containing all clusters in $\Pi' \cap Y$ whose centers lie in $B(v,L)$
12. Remove the vertices in $C'$ from $Y$
13. (Remark: $C'$ has radius at most $\Lambda + \Gamma$)
14. If $Y \ne \emptyset$ goto Step 1
15. End

Figure 3.1: Algorithm CUT-CLUSTERS
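The steps of CUT-CLUSTERS can be sketched in Python as follows. This is an illustrative reading of Figure 3.1 under several assumptions: the bias is taken to be $p = c\alpha\Gamma/\Lambda$ (capped at 1), the net $N$ is built greedily, ball membership uses a strict inequality, and the function names, the cluster representation as `(center, members)` pairs, and the default value of `c` are all hypothetical choices made for this example.

```python
import random


def greedy_net(points, dist, r):
    """Subset of points with pairwise distances >= r that covers all points."""
    net = []
    for q in points:
        if all(dist(q, y) >= r for y in net):
            net.append(q)
    return net


def cut_clusters(points, dist, clusters, gamma, lam, alpha, c=12.0, seed=None):
    """clusters: list of (center, frozenset_of_points) pairs forming Pi'.
    Returns the coarser partition Pi'' as (root, frozenset_of_points) pairs."""
    rng = random.Random(seed)
    p = min(1.0, c * alpha * gamma / lam)       # heads probability (assumed form)
    threshold = lam * (1 - 1 / (4 * alpha))
    remaining = list(clusters)                   # clusters of Pi' still in Y
    coarse = []
    for v in greedy_net(points, dist, lam / 2):  # Step 1: next unused root
        L = lam / 2                              # Step 2
        while rng.random() >= p:                 # Steps 3-5: tails, grow L
            L += gamma
            if L > threshold:                    # Steps 6-9: forced cut
                L_hat = rng.uniform(0, lam / (4 * alpha))
                L = threshold + gamma * (L_hat // gamma)
                break
        # Step 11: merge all remaining clusters whose center lies in B(v, L).
        inside = [cl for cl in remaining if dist(v, cl[0]) < L]
        if inside:
            coarse.append((v, frozenset().union(*(m for _, m in inside))))
            # Step 12: remove the merged clusters from Y.
            remaining = [cl for cl in remaining if dist(v, cl[0]) >= L]
        if not remaining:                        # Step 14
            break
    return coarse


if __name__ == "__main__":
    pts = list(range(20))
    fine = [(x, frozenset({x})) for x in pts]    # singleton clusters
    for root, members in cut_clusters(pts, lambda a, b: abs(a - b),
                                      fine, gamma=1, lam=8, alpha=1, c=2, seed=0):
        print(root, sorted(members))
```

Because every center lies within $\Lambda/2$ of some root and $L \ge \Lambda/2$ always holds, every cluster of $\Pi'$ is absorbed by the time all roots have been processed, so the output is a partition that respects the cluster boundaries of $\Pi'$.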


Lemma 3.2. Given a metric $(X,d)$ with a $\Gamma$-ball partition $\Pi'$ of $X$ into clusters lying in balls of radius at most $\Gamma \ge 1$, and a value $\Lambda \ge 8\Gamma$, there is a randomized algorithm to create a $(\Lambda + \Gamma)$-ball partition $\Pi''$ of $X$, where each cluster of $\Pi'$ is contained in some cluster of $\Pi''$, and for any $x \in X$ and radius $0 < r < \Lambda$,

$$\Pr[B(x,r) \text{ is cut in } \Pi''] \le O\!\left(\frac{r + \Gamma}{\Lambda}\right) \alpha.$$

Proof. Note that we can assume that $\Gamma \le \Lambda/c\alpha$ and $\Lambda \ge \alpha$, since otherwise the lemma is trivially true. Using the algorithm CUT-CLUSTERS given in Figure 3.1, we create a partition of $Y$ (and hence of $X$); all distances are measured according to the original distance function $d$ in $X$.

Let us define $\mathcal{B}_x = B(x,r)$. Note that if $\mathcal{B}_x$ is cut in $\Pi''$ due to some value of $L$ from $v \in N$ (for the first time), then $L$ falls into the interval $[d(v,x) - r - \Gamma,\ d(v,x) + r + \Gamma]$. Indeed, if $\mathcal{B}_x$ is cut in $\Pi''$, there are at least two clusters $C_1, C_2 \in \Pi'$ such that they both intersect $\mathcal{B}_x$, and $B(v,L)$ contains one of their centers but not both. Since both clusters intersect $\mathcal{B}_x$, their centers $c_1$ and $c_2$ are at distance at most $r + \Gamma$ from $x$. If $L < d(v,x) - r - \Gamma$, the triangle inequality implies that $B(v,L)$ cannot contain either center. Similarly, if $L > d(v,x) + r + \Gamma$, $B(v,L)$ contains both of them. Hence the value of $L$ must fall into the interval indicated above.

If a cut in Steps 11-12 is made due to the appearance of a heads in Step 4, we call such a cut a normal cut, else we call it a forced cut. We now bound the probability that the ball $\mathcal{B}_x = B(x,r)$ is cut due to either type.

Normal cuts. Consider the first instant in time when the parameter $L$ for some root $v \in N$ reaches a value such that the cut obtained by taking all $\Pi' \cap Y$ clusters with centers in $B(v,L)$ would cut $\mathcal{B}_x$. (If there is no such time, then $\mathcal{B}_x$ is never cut by a normal cut.) In this case, $L$ must also be in the range $d(v,x) \pm (r + \Gamma)$, and increases with time. Now either (i) we make a normal cut before $L$ goes outside this range; or (ii) we make a forced cut; or (iii) $L$ goes outside the range and we make no cut in this range. In any case, the fate of $\mathcal{B}_x$ is decided; $\mathcal{B}_x$ is either cut or contained in a new cluster with center $v$. We now upper-bound the probability that event (i) happens. There are at most $2(r+\Gamma)/\Gamma$ coin flips made (with bias $p$) when the value of $L$ is in the correct range of width at most $2(r+\Gamma)$, and one of these flips must come up heads for the cut to be made. The trivial union bound now shows this probability to be at most $p \cdot 2(r+\Gamma)/\Gamma$.

Forced cuts. Let us look at some root $v \in N$ and bound the probability that a forced cut is made with cutting radius $L$ from $v$ in some range $\mathcal{R} = d(v,x) \pm (r+\Gamma)$. Since the cut is forced and the value of $L$ is greater than $\Lambda(1 - 1/4\alpha) \ge 3\Lambda/4$, we must have flipped a sequence of at least $\Lambda/4\Gamma$ successive tails; the probability of this event is at most

$$(1-p)^{\Lambda/4\Gamma} \le e^{-p\Lambda/4\Gamma}. \qquad (3.1)$$

Now, we choose $L$ to be a multiple of $\Gamma$ uniformly in a range of width at most $\Lambda/4\alpha$, and hence the probability that $L$ falls into a range of length $2(r+\Gamma)$ is at most $2(r+\Gamma)/(\Lambda/4\alpha)$. Multiplying this by (3.1), we obtain a bound of $e^{-p\Lambda/4\Gamma} \times \frac{8\alpha(r+\Gamma)}{\Lambda}$ on the probability that a forced cut is made around $v$ with $L$ in the range $\mathcal{R}$ such that the cluster $C'$ with center $v$ in $\Pi''$ may cut $\mathcal{B}_x$. Finally, for any $x \in X$, $\mathcal{B}_x$ can only be cut by clusters from roots $v \in N$ that are at distance at most $(r + \Gamma) + \Lambda \le 3\Lambda$ from $x$; by Prop. 2.1, there are at most $|B(x,3\Lambda) \cap N| \le 12^{\alpha}$ such roots. Now we choose $c$ to be large enough; the probability of $\mathcal{B}_x$ being cut by a forced cut due to any such root is at most $12^{\alpha} \times e^{-3\alpha} \times \frac{8\alpha(r+\Gamma)}{\Lambda}$ by the union bound. ■

We now use the above lemma to prove Theorem 3.2. Using $\Pi' = \Pi_i$, $\Gamma = \eta_i \le \rho^i(\rho/(\rho-1))$, and $\Lambda = \eta_{i+1} - \Gamma = \rho^{i+1}$, and using $N = N_{i+1}$ (which is a $\rho^{i+1}/2 = \Lambda/2$ net), we create a $(\Gamma + \Lambda = \eta_{i+1})$-ball partition $\Pi_{i+1}$ such that for all $x$ and all $r < \rho^{i+1}$, and $\varepsilon = O(1/\alpha)$, we have

$$\Pr[B(x,\varepsilon r) \text{ cut}] \le O\!\left(\frac{\varepsilon r + \Gamma}{\Lambda}\right) \alpha \le \frac{1}{10} \qquad (3.2)$$

for $\rho/\alpha$ and $c$ being large enough constants. The probability distribution $\mu$ over all decompositions $\Pi$ thus generated satisfies the requirements of a PPHD as given in Definition 3.1. Finally, we bound the degree $\deg(\mu)$ of the PPHD $\mu$; note that each level-$i$ cluster is centered at some $v \in N_i$, hence the number of level-$i$ clusters contained in some level-$(i+1)$ cluster is $(2\eta_{i+1}/(\rho^i/2))^{O(\alpha)} = \alpha^{O(\alpha)}$ by Prop. 2.1. ■

Few Hierarchical Decompositions. The above proof immediately gives us a PPHD $\mu_M$ with a support on only $M = O(\log n + \log\log\Delta)$ HDs. By sampling from the distribution $\mu$ for $M$ times, we get the HDs $\Pi^{(1)}, \ldots, \Pi^{(M)}$ and let the PPHD $\mu_M$ be the uniform distribution on these HDs. By (3.2), for each $j \in [1 \ldots M]$, point $x \in X$ and radius $r < \rho^i$, $B(x,\varepsilon r)$ is cut in the partition $\Pi_i^{(j)}$ with probability at most $1/10$; hence a Chernoff bound implies that this ball is cut in the level-$i$ partitions of more than $M/2$ of the HDs with probability less than an inverse polynomial in $n \log\Delta$. Now taking the trivial union bound over all possible values of the center $x \in X$, and all the $\log\Delta$ values of $r$ which are powers of 2, shows that $\mu_M$ is a $(\rho, \varepsilon/2)$-PPHD whp.

3.1.2 Even Fewer Hierarchical Decompositions

While the proof of Theorem 3.2 and the discussion above do not produce a PPHD with small support (of size $O(\alpha \log \alpha)$), we have seen all the essential ideas required to prove the existence of such a distribution $\hat{\mu}$ and hence to complete the proof of Theorem 3.1. To prove this result, we use the locality of the construction, in conjunction with the Lovász Local Lemma (LLL). This locality property is the very reason why we built the hierarchical decomposition bottom-up; it ensures that if any particular ball is not cut at some low level $i$ (the "local decisions"), it is not cut at levels higher than $i$ (i.e., the "non-local decisions"). Also, we choose the decomposition procedure of Theorem 3.2 in preference to others (e.g., those in [12] and [23]) since they choose a single random radius for all clusters in one particular partition $\Pi$ of $X$, which causes correlations across the entire metric space. (The LLL has been used in similar contexts in [12, 17].)

Proof of Theorem 3.1: To show that there is a distribution over only $m = O(\alpha \log \alpha)$ trees, we use an idea similar to that in the previous section, augmented with some ideas from [12]. Instead of building one hierarchical decomposition $\Pi$ bottom-up, we build $m$ hierarchical decompositions $\Pi^{(1)}, \ldots, \Pi^{(m)}$ simultaneously (also from the bottom up).

As before, the proof proceeds inductively; we assume that we are given level-$i$ partitions $\Pi_i^{(1)}, \ldots, \Pi_i^{(m)}$, where $\Pi_i^{(j)}$ is the level-$i$ partition belonging to $\Pi^{(j)}$. We then show that we can build level-$(i+1)$ partitions $\Pi_{i+1}^{(1)}, \ldots, \Pi_{i+1}^{(m)}$, where each $\Pi_i^{(j)}$ is a refinement of the corresponding $\Pi_{i+1}^{(j)}$, and any given ball $B(x,\varepsilon r)$ with $\rho^i < r < \rho^{i+1}$ is cut in at most $m/2$ of these level-$(i+1)$ partitions. We start off this process with each $\Pi_0^{(j)} = \{\{x\} : x \in X\}$ being the partition consisting of all singleton points in $X$. Let $J = \{1, \ldots, m\}$. Given $m$ level-$i$ partitions $(\Pi_i^{(j)})_{j \in J}$, we create $m$ level-$(i+1)$ partitions $(\Pi_{i+1}^{(j)})_{j \in J}$ using the procedure in Lemma 3.2 independently on each of the $m$ decompositions; parameters are set as in the proof of Theorem 3.2, with $\Lambda = \rho^{i+1}$, $\Gamma = \eta_i$, and $\varepsilon = 1/O(\alpha)$. This extends the $m$ hierarchical decompositions to the $(i+1)$-st level; it remains to show that the probability of balls being cut is small.

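To describe the events of interest, let us take $\beta = \varepsilon\rho^{i+1}$ and define $Z$ to be a $\beta$-net of $X$. For each $z \in Z$, define $\mathcal{B}_z$ to be $B(z,2\beta)$, and $\mathcal{E}_z$ to be the event that $\mathcal{B}_z$ is cut in more than $m/2$ of the partitions $(\Pi_{i+1}^{(j)})_{j \in J}$, which we refer to as a "bad" event (used in Section 3.2). We prove the following claim using the Lovász Local Lemma.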

Claim 3.3. Given any $(\Pi_i^{(j)})_{j \in J}$, $\Pr[\bigwedge_{z \in Z} \neg E_z] > 0$.

Lemma 3.3. (Lovász Local Lemma) Given a set of events $\{E_z\}_{z \in Z}$, suppose that each event is mutually
independent of all but at most $B$ other events. Further suppose that, for each event $E_z$, $\Pr[E_z] \le p$. Then
if $e\,p\,(B+1) \le 1$, $\Pr[\bigwedge_{z \in Z} \neg E_z] > 0$.

Proof of Claim 3.3: First, let us calculate the probability of $E_z$: by changing the constant in $\varepsilon$, we can
make the probability that a ball $B_z$ is cut in one level-$(i+1)$ partition at most $1/8$. Let us denote by
$A_z^j$ the event that $B_z$ is cut in partition $\Pi_{i+1}^{(j)}$. The expected number of partitions in which the
ball is cut is at most $m/8$. Since the partitions are constructed independently, the probability of the event that
$B_z$ is cut in $m/2$ partitions (which is at least four times the expectation) is at most $\exp(-9m/40)$; this can be
established using a standard Chernoff bound. This, in turn, is at most $(0.8)^m$, which we define to be $p$.

Next we show that an event $E_z$ is mutually independent of all events $E_{z'}$ such that $d(z, z') > 4\eta_{i+1}$.

For each partition $\Pi_{i+1}^{(j)}$, each root $v \in N_{i+1}$ determines its radius by conducting a random experiment
independent of any other roots' experiments. These random experiments, and only these, determine whether
events such as $A_z^j$ occur. In turn, whether event $E_z$ occurs is determined only by events
$A_z^1, \ldots, A_z^m$. For a particular $j$, for each $z$, all of the cuts that could affect $B_z$ in the algorithm
Cut-Clusters are made from roots $v \in N_{i+1}$ at distance at most $2R + \Gamma + \Lambda \le 2\eta_{i+1}$ from $z$.
Whether event $A_z^j$ occurs is determined by the experiments corresponding to these roots alone. If
$d(z, z') > 4\eta_{i+1}$, then there is no intersection between the experiments for $z$ and the experiments for $z'$.
Since $E_z$ is determined by $A_z^1, \ldots, A_z^m$, $E_z$ is mutually independent of the set of all $E_{z'}$ such
that $d(z, z') > 4\eta_{i+1}$.

We now apply the LLL. Note that the number of $z' \in Z$ within distance $4\eta_{i+1}$ of $z$, for any $z \in Z$, is
at most $|B(z, 4\eta_{i+1}) \cap Z| \le O(\alpha)^{\alpha}$. We define this quantity to be $B$; $e\,p\,(B+1)$ is at
most 1 for $m = O(\alpha \log \alpha)$, and Claim 3.3 follows. ■
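The final step of the claim can be checked numerically. The following is a minimal sketch, assuming the bound $p = (0.8)^m$ from the proof; the function names and the sample value of $B$ are illustrative and not taken from the paper. Solving $e \cdot 0.8^m \cdot (B+1) \le 1$ for $m$ gives $m \ge (1 + \ln(B+1))/\ln 1.25$, which is $O(\alpha \log \alpha)$ when $B = O(\alpha)^{\alpha}$.

```python
import math


def lll_condition_holds(m: int, B: int) -> bool:
    """Symmetric LLL condition e * p * (B + 1) <= 1 with p = 0.8**m."""
    p = 0.8 ** m
    return math.e * p * (B + 1) <= 1.0


def smallest_m(B: int) -> int:
    """Smallest number of partitions m satisfying the condition (linear search)."""
    m = 1
    while not lll_condition_holds(m, B):
        m += 1
    return m


def closed_form_m(B: int) -> int:
    """Closed form obtained by taking logarithms of e * 0.8**m * (B + 1) <= 1."""
    return math.ceil((1.0 + math.log(B + 1)) / math.log(1.25))


if __name__ == "__main__":
    alpha = 4                 # hypothetical doubling dimension
    B = alpha ** alpha        # stand-in for the O(alpha)^alpha dependency bound
    print(smallest_m(B), closed_form_m(B))
```

Because $\ln(B+1)$ grows like $\alpha \ln \alpha$, the required $m$ grows only slightly faster than linearly in $\alpha$.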

Having proved the claim, let us now show that with nonzero probability, each $B(x, \varepsilon r)$ for $x \in X$ and
$\rho^i \le r < \rho^{i+1}$ is not cut in at least $m/2$ of the level-$(i+1)$ partitions $(\Pi_{i+1}^{(j)})_{j \in J}$.
Let us call this event $SC_{i+1}$.

The claim shows that with nonzero probability, each ball $B_z$ with $z \in Z$ is not cut in at least $m/2$ of the
partitions $(\Pi_{i+1}^{(j)})_{j \in J}$. Since each $x \in X$ is at distance at most $R$ from some $z_x \in Z$, the
triangle inequality implies that $B(x, \varepsilon r) \subseteq B(x, R)$ is not cut if $B(z_x, 2R)$ is not cut, which
holds in at least half of the partitions. Hence $SC_{i+1}$ also holds with nonzero probability.

Finally, we prove that we can choose a random set of HDs such that $SC_{i+1}$ occurs for each
$1 \le i+1 \le h$ simultaneously with nonzero probability. The key to the proof is that we have assumed an
arbitrary (worst-case) set of partitions $(\Pi_i^{(j)})_{j \in J}$ at level $i$ in proving a nonzero lower bound on
$\Pr[SC_{i+1}]$. Hence, we can ignore any dependence among the events $SC_{i+1}$ for $1 \le i+1 \le h$, and simply
multiply their nonzero probabilities together to obtain a nonzero lower bound on the probability that they all occur
simultaneously. ■

3.2  An Algorithm for Finding the Decompositions  The above procedure can be made algorithmic using
an approach based on Beck’s algorithmic version of the LLL (see, e.g., [1, 4]). The decomposition satisfies
all properties of the one that is shown to exist using the LLL in Theorem 3.1, although with some changes in
constant parameter values. As in the proof of Theorem 3.1, we build $m = O(\alpha \log \alpha)$ HDs level by level in
a bottom-up fashion.

On any particular level $i+1$, we begin by choosing $m$ partitions at random. After making the random
choices, we examine the partitions and identify all of the bad events that have occurred. We then group
together bad events that may depend on each other, as well as “good” events that may depend on the bad
events. Each group forms a connected component in the LLL dependency graph. We show that, with high
probability, all connected components have size $O(\log \nu)$, where $\nu = |Z|$ is the size of the
$\varepsilon\rho^{i+1}$-net of $X$.
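The grouping step can be illustrated with a small sketch. It is a simplification, not the paper's procedure: it assumes a caller-supplied symmetric predicate `depends(a, b)` (hypothetical) saying whether two events may depend on each other, keeps every bad event together with every event that may depend on a bad one, and returns the connected components of the dependency graph restricted to the kept events.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List, Set


def dependency_groups(
    events: Iterable[Hashable],
    bad: Iterable[Hashable],
    depends: Callable[[Hashable, Hashable], bool],
) -> List[Set[Hashable]]:
    """Group bad events, and good events touching them, into connected components."""
    events = list(events)
    bad_set = set(bad)
    # Keep each bad event plus every event that may depend on some bad event.
    kept = {e for e in events if e in bad_set or any(depends(e, b) for b in bad_set)}

    groups: List[Set[Hashable]] = []
    seen: Set[Hashable] = set()
    for start in events:
        if start not in kept or start in seen:
            continue
        component: Set[Hashable] = set()
        queue = deque([start])
        seen.add(start)
        while queue:                      # breadth-first search inside `kept`
            u = queue.popleft()
            component.add(u)
            for v in kept:
                if v not in seen and depends(u, v):
                    seen.add(v)
                    queue.append(v)
        groups.append(component)
    return groups


if __name__ == "__main__":
    # Toy instance: events sit at integer positions; neighbors may depend on each other.
    near = lambda a, b: a != b and abs(a - b) <= 1
    print(dependency_groups(range(10), bad={2, 7, 8}, depends=near))
```

Each returned group is the unit whose random choices would be undone and redone together.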

Once the groups have been identified, we need to eliminate the bad events. Hence, for each group,
we “undo” all of the random choices concerning that group, while not modifying any choices that do not
affect the group. New choices must be made for each group so that no bad event occurs. Because the group
size is small (the number of centers $v \in N_{i+1}$ concerning the group for which we choose random radii is also
$O(\log \nu)$), we can find new settings for these choices using exhaustive search in polynomial time.

One interesting complication in this proof is that the sets of clusters containing a group have different
shapes in the $m$ different partitions. In each partition, we cut out a “hole”, and redo the choices within the
hole. The boundary of the hole is formed from the boundaries of the clusters that may influence the bad
events (and the good events) in the group. In forming the boundary, additional good events may be added
to the hole. As a consequence, it is possible that a good event inside a hole in one partition may appear
inside a different hole in another partition. Hence, when we perform exhaustive search, these holes must
be considered together. However, our method of bounding the size of each connected component already
takes into account any merging of holes on account of shared good events, so that we never have to redo the
choices for a group of size more than $O(\log \nu)$.


Another issue is that the subset of centers in a hole that belong to $N_{i+1}$, the $\rho^{i+1}/2$-net that covers the
entire metric, may not by themselves cover the hole. (Portions of the hole may be covered by centers outside
the hole.) So for each of the $m$ partitions, we may have to add additional net points inside the hole to obtain
a complete cover for it. We show that the number of net points in the hole increases by only a constant factor
and remains $O(\log \nu)$, and the degree of the hierarchical decomposition trees is at most $\alpha^{O(\alpha)}$, as
before.

4  The $(1+\tau)$-Stretch Routing Schemes

Given a $(\rho,\varepsilon)$-PPHD with a support on $m$ HDs, we can now define, for every $0 < \tau \le 1$, a
$(1+\tau)$-stretch routing scheme which uses routing tables of size at most
$m(\alpha/\tau)^{O(\alpha)} \log\Delta \log\delta$ bits at every node.

We consider routing schemes in two models. In a basic model, we assume that there is no underlying
routing fabric and each node can only send packets to its direct neighbors. In a second model, we can build
an overlay hierarchical routing scheme upon an underlying routing fabric like IP that can send packets to
any specific node in the network. We specify the routing algorithm in the basic model, but also indicate how
one can circumvent certain steps of this algorithm when an underlying routing mechanism is given.

Let us recall some of the notation defined earlier. Let $\Pi^{(1)}, \ldots, \Pi^{(m)}$ be the $m$ hierarchical
decompositions on which $\mu_m$ has positive support, and let the level-$i$ partition corresponding to $\Pi^{(j)}$ be
called $\Pi_i^{(j)}$. Recall that we can associate each hierarchical decomposition $\Pi^{(j)}$ with a tree $T_j$ (as
outlined in Section 2.1). Note that each of these trees has degree $\deg(\mu_m)$ bounded by $\alpha^{O(\alpha)}$ and a
height of at most $h = \log_\rho \Delta$. Recall that each internal vertex of the tree $T_j$ at level $i$ corresponds to
a cluster of $\Pi_i^{(j)}$, and the leaves of $T_j$, for all $j \in J$, correspond to vertices in $X$, where
$J = \{1, \ldots, m\}$. Let each internal vertex $v$ of each tree $T_j$ label its children by numbers
between 1 and $\deg(\mu_m)$; $v$ does not label anything with the number 0, but uses it to refer to its parent. Note
that this allows us to represent any path in a tree $T_j$ by a sequence of at most $2h = O(\log_\rho \Delta)$ labels.
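The label encoding of tree paths can be sketched as follows. This is a minimal illustration under stated assumptions: each tree vertex is identified by the tuple of child labels on the way down from the root (a representation chosen here for convenience, not prescribed by the paper), label 0 means "go to the parent", and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple


def path_labels(src: Sequence[int], dst: Sequence[int]) -> List[int]:
    """Labels describing the tree path from vertex `src` to vertex `dst`.

    Vertices are given as root-to-vertex sequences of child labels (all >= 1).
    The path climbs to the lowest common ancestor with 0s, then descends.
    """
    k = 0
    while k < min(len(src), len(dst)) and src[k] == dst[k]:
        k += 1                                  # length of the common prefix
    return [0] * (len(src) - k) + list(dst[k:])


def follow(src: Sequence[int], labels: Sequence[int]) -> Tuple[int, ...]:
    """Replay a label sequence starting from `src` and return the vertex reached."""
    node = list(src)
    for label in labels:
        if label == 0:
            node.pop()                          # 0 refers to the parent
        else:
            node.append(label)                  # positive label selects a child
    return tuple(node)


if __name__ == "__main__":
    u, w = (2, 1, 3), (2, 4, 1)                 # two leaves at depth h = 3
    labels = path_labels(u, w)
    print(labels, follow(u, labels))
```

For two leaves at depth $h$ the sequence has at most $h$ zeros followed by at most $h$ child labels, matching the $2h$ bound.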

Lemma 3.1 already shows that the $m$ trees thus created form a small $O(\rho/\varepsilon) = O(\alpha^2)$-stretch Steiner
tree cover, which can be used for routing purposes (as in Section 4.3). However, since such a large stretch is
not always acceptable, we improve on this scheme in the following subsections to get better routing bounds.

4.1  The Addressing Scheme  Given a tree $T_j$ and a vertex $x \in X$, we assign $x$ a local address
$\mathrm{addr}_j(x)$, which consists of $h = \log_\rho \Delta$ blocks, one for each level of the tree $T_j$. Each block
has a fixed length.

The $i$-th block of $\mathrm{addr}_j(x)$ corresponds to partition $\Pi_i^{(j)}$ and contains the label assigned to the
cluster $C_x$ containing $x$ in $\Pi_i^{(j)}$ by $C_x$'s parent in $T_j$. Since any such label is just a number between 1
and $\deg(\mu_m)$, where $\deg(\mu_m) = \alpha^{O(\alpha)}$, we need $O(\alpha \log \alpha)$ bits per block. In fact, one
can extend this addressing scheme to any cluster $C$ in $T_j$. If $C$ is a level-$i$ cluster, the $k$-th block of
$\mathrm{addr}_j(C)$ contains *’s for $k < i$; $\mathrm{addr}_j(X)$ for the root cluster of $T_j$ contains all *’s,
matching all vertices in $X$.

The global address $\mathrm{addr}(x)$ of point $x \in X$ is the concatenation
$(\mathrm{addr}_1(x), \ldots, \mathrm{addr}_m(x))$ of its local addresses $\mathrm{addr}_j(x)$ for $j \in J$. Since each
cluster $C$ belongs to only one tree $T_j$, we define $\mathrm{addr}_{j'}(C)$ for every $j' \ne j$ to be a
sequence of #’s of the correct length (where # are dummy symbols matching nothing), and hence define a
global address of $C$ as well. (This is only for simplicity; in actual implementations, cluster addresses for $T_j$
can be given by the tuple $(\mathrm{addr}_j(C), j)$.)

Since there are $O(\alpha \log \alpha)$ bits per block, $h$ blocks per local address, and $m$ local addresses per
global address, substitution of the appropriate values gives the address length $A$ to be at most
$m \times h \times \lceil \log(\deg(\mu_m)) \rceil = O(\alpha \log \alpha) \times \log_\rho \Delta \times O(\alpha \log \alpha) = O(\alpha^2 \log \alpha \log \Delta)$ bits.
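The addressing scheme and the length count can be sketched in a few lines. The sketch makes assumptions that are not in the paper: a local address is stored as a tuple of $h$ labels ordered from the top level down (index 0 holds the block for level $h-1$), the wildcard is the string `"*"`, and all function names and sample numbers are hypothetical.

```python
from typing import Sequence, Tuple, Union

WILDCARD = "*"
Block = Union[int, str]


def cluster_address(point_addr: Sequence[int], level: int) -> Tuple[Block, ...]:
    """Local address of the level-`level` cluster containing the point.

    Blocks for levels below `level` are replaced by wildcards; level h gives
    the root cluster, whose address is all wildcards.
    """
    h = len(point_addr)
    if not 0 <= level <= h:
        raise ValueError("level must lie between 0 and h")
    return tuple(point_addr[: h - level]) + (WILDCARD,) * level


def matches(cluster_addr: Sequence[Block], point_addr: Sequence[int]) -> bool:
    """True if the point's local address is consistent with the cluster address."""
    return len(cluster_addr) == len(point_addr) and all(
        c == WILDCARD or c == p for c, p in zip(cluster_addr, point_addr)
    )


def address_bits(m: int, h: int, max_degree: int) -> int:
    """Global address length m * h * ceil(log2(max_degree)) in bits."""
    bits_per_block = max(1, (max_degree - 1).bit_length())
    return m * h * bits_per_block


if __name__ == "__main__":
    x = (3, 1, 2, 5)                       # hypothetical local address, h = 4
    print(cluster_address(x, 2))           # level-2 cluster containing x
    print(address_bits(m=8, h=10, max_degree=64))
```

Storing the highest level first makes "common prefix" comparisons between addresses correspond directly to shared ancestors in the tree.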

4.2  The Routing Table  For each point $x \in X$, we maintain a routing table $\mathrm{Route}_x$ that contains the
following information for each $T_j$, $1 \le j \le m$:

1. For each ancestor of $x$ in $T_j$ that corresponds to a cluster $C$ containing $x$, we maintain a table entry for
$C$.

2. Moreover, for each such $C$, we maintain an entry for each descendant of $C$ in $T_j$ reachable within $\ell$
hops in tree $T_j$. Here $\ell = O(\log_\rho 1/(\varepsilon\tau))$, with the constants chosen such that $\eta_{i-\ell} \le \tau\varepsilon\rho^{i-1}/4$ for every level $i$.

In the routing table $\mathrm{Route}_x$ for $x$, each of the above entries thus corresponds to some level-$i'$ cluster
$C'$ in $T_j$. Let $\mathrm{close}_x(C')$ be the closest point in $C'$ to $x$. (We assume, w.l.o.g., that ties are broken
in some consistent way, so that any node $y$ on a shortest path from $x$ to $\mathrm{close}_x(C')$ has the value
$\mathrm{close}_y(C') = \mathrm{close}_x(C')$; in fact, this consistency is the only property we use.) For this $C'$,
$\mathrm{Route}_x$ stores (a) the global address $\mathrm{addr}(C')$ by which the table is indexed, (b) the identity of a
“next hop” neighbor $y$ of $x$ that stays on a shortest path from $x$ to the closest point $\mathrm{close}_x(C')$ in $C'$,
and (c) an extra bit $\mathrm{ValidPath}_x(C')$: if the cluster $\ell$ levels above $C'$ in $T_j$ is the cluster $C$, then
$\mathrm{ValidPath}_x(C')$ is set to true if $B(x, \varepsilon\rho^{i'+\ell})$ is entirely contained within
cluster $C$ and $d(x, \mathrm{close}_x(C')) \le \varepsilon\rho^{i'+\ell}$, and is set to false otherwise. Of course, if we
reach the root of $T_j$ while trying to go up $\ell$ levels, then the bit is set to true. Note that if there is an
underlying routing fabric like IP, we can store the IP address of some node in $C'$ (say, the closest one) instead of
(b) and (c) above.

Lemma 4.1. The number of entries in the routing table $\mathrm{Route}_x$ of any $x$ is at most
$\log\Delta \times (\alpha/\tau)^{O(\alpha)}$.

Proof: Let us estimate the number of entries in $\mathrm{Route}_x$ for any $x \in X$. There are $m$ trees. For each tree
$T_j$, $j \in J$, there are $h = \log_\rho \Delta$ ancestors of $x$, and the degree of the tree is bounded by
$\deg(\mu_m) = \alpha^{O(\alpha)}$. Recall that $\rho$ and $1/\varepsilon$ are both $O(\alpha)$, and hence
$\ell = O(\log_\alpha(\alpha/\tau))$. Plugging these values in, we get that the number of entries for $x$ across $m$ trees
is at most
$m \times h \times (\deg(\mu_m))^{\ell} = O(\alpha \log \alpha) \times O(\log_\alpha \Delta) \times (\alpha/\tau)^{O(\alpha)} = \log\Delta \times (\alpha/\tau)^{O(\alpha)}$.
Each entry is indexed by one global address (of at most $A = O(\alpha^2 \log \alpha \log \Delta)$ bits, which
we do not store in $\mathrm{Route}_x$ since we can deduce it from $\mathrm{addr}(x)$ based on the clustering structure);
each entry contains the identity of the next hop (which uses $O(\log \delta)$ bits, where $\delta$ is the maximum degree
of $G$), a path length field (to be specified in Section 5.1), and one additional ValidPath bit. ■

The forwarding algorithm makes use of two functions, $\mathrm{NextHop}_x$ and $\mathrm{PrefMatch}_x$. For a point $x$ and
a level-$i'$ cluster $C'$ in $T_j$, the function $\mathrm{NextHop}_x(\mathrm{addr}(C'))$ returns the next hop on the path
from $x$ to $\mathrm{close}_x(C')$ provided that the next hop does not leave the cluster $C$ at level $i'+\ell$ that
contains $C'$, and null otherwise. (As we shall see, the packet forwarding algorithm is guaranteed never to encounter a
null next hop.) Given points $x$ and $t$ in $X$, the function $\mathrm{PrefMatch}_x(t)$ returns an $\mathrm{addr}(C')$ in
$\mathrm{Route}_x$ such that in some $T_j$, $t$ belongs to the level-$i'$ cluster $C'$, $\mathrm{ValidPath}_x(C')$ is
true, and the value $i'$ is the smallest across all trees. Note that both of these functions can be computed
efficiently by node $x$. Furthermore, it is possible to support the functions with data structures of size comparable
to that of $\mathrm{Route}_x$.
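A linear-scan version of $\mathrm{PrefMatch}_x$ can be sketched as below. This is an illustration only: the `Entry` record, the wildcard convention, and the top-down block ordering are assumptions introduced here, and a real table would use a more efficient index than a scan.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple, Union

WILDCARD = "*"
Block = Union[int, str]


@dataclass(frozen=True)
class Entry:
    tree: int                          # index j of the tree T_j
    level: int                         # level i' of the cluster C'
    cluster_addr: Tuple[Block, ...]    # local address of C' in T_j
    next_hop: str                      # neighbor toward close_x(C')
    valid_path: bool                   # the ValidPath_x(C') bit


def matches(cluster_addr: Sequence[Block], point_addr: Sequence[int]) -> bool:
    """True if the destination's local address falls inside the cluster."""
    return len(cluster_addr) == len(point_addr) and all(
        c == WILDCARD or c == p for c, p in zip(cluster_addr, point_addr)
    )


def pref_match(table: List[Entry], dest_addrs: Sequence[Sequence[int]]) -> Optional[Entry]:
    """Lowest-level valid entry whose cluster contains the destination.

    dest_addrs[j] is the destination's local address in tree T_j.
    """
    candidates = [
        e for e in table
        if e.valid_path and matches(e.cluster_addr, dest_addrs[e.tree])
    ]
    return min(candidates, key=lambda e: e.level) if candidates else None


if __name__ == "__main__":
    table = [
        Entry(0, 2, ("*", "*"), "a", True),    # root of T_0
        Entry(0, 1, (1, "*"), "b", False),     # contains t, but path not valid
        Entry(1, 2, ("*", "*"), "a", True),    # root of T_1
        Entry(1, 1, (2, "*"), "c", True),      # best valid match
        Entry(1, 0, (2, 7), "c", True),        # does not contain t
    ]
    print(pref_match(table, dest_addrs=[(1, 4), (2, 5)]))
```

Since the root entries always match and are always valid, a non-empty table built as described never yields `None`.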

Note that once the points in $X$ have been assigned addresses (for which we have described only an
off-line algorithm), the routing tables can be built up in a completely distributed fashion. In particular, a
distributed breadth-first-search algorithm can be applied to determine whether a ball of a certain radius is
cut in a particular decomposition, and a distributed implementation of the Bellman-Ford algorithm can be
used to establish the next-hop entries for destinations for which the shortest paths lie within a certain cluster.

4.3  The Forwarding Algorithm  The idea behind the forwarding algorithm is to start a packet off from
its origin $s$ towards an intermediate cluster $C$ containing its destination $t$; the packet header thus consists of
two pieces of information $(\mathrm{addr}(t), \mathrm{addr}(C))$, where $t$ is the destination node for the packet and $C$
is the intermediate cluster containing $t$. Initially, the cluster can be chosen (degenerately) to be the root cluster of
(say) tree $T_1$.

Upon reaching a node $x$ in the intermediate cluster $C$, a new and smaller intermediate cluster $C'$, also
containing $t$, must be chosen, possibly from a different tree; the packet header must be updated with
$\mathrm{addr}(C')$, which remains the same until the packet reaches $C'$. Suppose that the new cluster $C'$ containing
$t$ is at level $i'$. After selecting this cluster, the packet is sent off towards $C'$ with the new header, following a
shortest path that stays within the cluster $C$ at level $i'+\ell$ that contains both $x$ and $C'$. This process is
repeated until ultimately the packet reaches the cluster containing only the destination $t$. The algorithm is
presented in Figure 4.2.


1. Let packet header be $(\mathrm{addr}(t), \mathrm{addr}(C))$.
2. If $C$ contains $x$, the current node, then
3.     find $\mathrm{addr}(C') \leftarrow \mathrm{PrefMatch}_x(t)$
4.     let $y \leftarrow \mathrm{NextHop}_x(\mathrm{addr}(C'))$
5.     forward packet with new header $(\mathrm{addr}(t), \mathrm{addr}(C'))$ to $y$.
6. Else (now $x \notin C$)
7.     let $y \leftarrow \mathrm{NextHop}_x(\mathrm{addr}(C))$
8.     forward packet with unchanged header $(\mathrm{addr}(t), \mathrm{addr}(C))$ to $y$.
9. End

Figure 4.2: The Forwarding Algorithm at Node $x$
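The steps of Figure 4.2 can be exercised on a toy instance. The sketch below is not the paper's construction: it assumes a single tree over eight nodes on a path, with clusters given by aligned intervals of sizes 1, 2, 4 and 8, takes $\ell = 1$, treats every ValidPath bit as true, and adds a driver loop that stops once the packet is at $t$. All names are hypothetical.

```python
from typing import Callable, List, Tuple

Cluster = Tuple[int, int]        # toy cluster: half-open interval [lo, hi) of node ids
Header = Tuple[int, Cluster]     # (destination t, intermediate cluster C)
N = 8                            # nodes 0..7 on a path


def forward_step(x: int, header: Header,
                 pref_match: Callable[[int, int], Cluster],
                 next_hop: Callable[[int, Cluster], int]) -> Tuple[int, Header]:
    """One execution of the forwarding algorithm at node x."""
    t, cluster = header
    lo, hi = cluster
    if lo <= x < hi:                      # Step 2: x lies in C
        cluster = pref_match(x, t)        # Step 3: choose a smaller C' containing t
    y = next_hop(x, cluster)              # Step 4 or Step 7
    return y, (t, cluster)                # Step 5 or Step 8


def ancestors(x: int) -> List[Cluster]:
    """Aligned intervals of size 1, 2, 4, ..., N that contain x."""
    out, size = [], 1
    while size <= N:
        lo = (x // size) * size
        out.append((lo, lo + size))
        size *= 2
    return out


def toy_pref_match(x: int, t: int) -> Cluster:
    """Smallest cluster containing t among x's ancestors and their children."""
    table = set(ancestors(x))
    for lo, hi in ancestors(x):
        if hi - lo > 1:
            mid = (lo + hi) // 2
            table.update({(lo, mid), (mid, hi)})
    return min((c for c in table if c[0] <= t < c[1]), key=lambda c: c[1] - c[0])


def toy_next_hop(x: int, cluster: Cluster) -> int:
    """Move one step along the path toward the closest point of the cluster."""
    lo, hi = cluster
    if x < lo:
        return x + 1
    if x >= hi:
        return x - 1
    return x


def route(s: int, t: int) -> List[int]:
    path, x, header = [s], s, (t, (0, N))  # start with the root cluster
    while x != t:
        x, header = forward_step(x, header, toy_pref_match, toy_next_hop)
        path.append(x)
    return path


if __name__ == "__main__":
    print(route(0, 5))
    print(route(6, 1))
```

On a path graph every route is trivially shortest, so the toy only illustrates how the header's intermediate cluster shrinks along the way, not the stretch analysis.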
Theorem 4.1. The forwarding algorithm has a stretch of at most $(1+\tau)$, where $\tau \le 1$.

Proof: We first show that the algorithm is indeed valid: each of the steps can be executed and the packet
eventually reaches $t$. Suppose that the packet has just reached a node $x$ in an intermediate cluster $C$ containing
$t$ (with $\mathrm{addr}(C)$ in its header); thus $x$ needs to execute Step 3 to find a new cluster $C'$ containing $t$.
Clearly, $\mathrm{PrefMatch}_x(t)$ can return the root cluster $C_{\mathrm{root}}$ of any $T_j$, since it contains $t$. We
show, however, that the cluster $C'$ returned by $\mathrm{PrefMatch}_x(t)$ has a small diameter and nodes along a valid
shortest path from $x$ to $C'$ will forward the packet correctly until it reaches $C'$.

Lemma 4.2. If the packet is at node $x$ with distance to the target $t$ being $d(x,t) \le \varepsilon\rho^i$, Step 3 must
return some $\mathrm{addr}(C')$ such that cluster $C' \ni t$ is at level $(i-\ell)$ or lower in some $T_{j'}$ with
$\mathrm{ValidPath}_x(C')$ being true. Furthermore, every vertex $v$ on every shortest path from $x$ to
$\mathrm{close}_x(C') = \mathrm{close}_v(C')$ has a non-null $\mathrm{NextHop}_v(\mathrm{addr}(C'))$.

Proof: The $(\rho,\varepsilon)$-PPHD ensures that there exists at least one tree $T_j$ such that
$B(x, \varepsilon\rho^i)$ is not cut in the level-$i$ partition $\Pi_i^{(j)}$; let $C_{\mathrm{cont}} \in \Pi_i^{(j)}$ be the
level-$i$ cluster in $T_j$ that contains $B(x, \varepsilon\rho^i)$. Let $C_t \in \Pi_{i-\ell}^{(j)}$ be
the level-$(i-\ell)$ cluster in $T_j$ containing $t$. The $\mathrm{ValidPath}_x(C_t)$ bit must be true since
$B(x, \varepsilon\rho^i) \subseteq C_{\mathrm{cont}}$ in $\Pi_i^{(j)}$ and
$d(x, \mathrm{close}_x(C_t)) \le d(x,t) \le \varepsilon\rho^i$; thus $\mathrm{PrefMatch}_x$ can (and may indeed) just
return $\mathrm{addr}(C_t)$ given no “better” choices. However, $\mathrm{PrefMatch}_x$ always finds a cluster $C'$ in some
$T_{j'}$, at the lowest level across all trees, such that $t \in C'$ and $\mathrm{ValidPath}_x(C')$ is true in
$\mathrm{Route}_x$. Let the level of $C'$ be $i'$; the value $i'$ is at most $(i-\ell)$. Now let
$C \in \Pi_{i'+\ell}^{(j')}$ be the cluster $\ell$ levels above $C' \in \Pi_{i'}^{(j')}$ in $T_{j'}$ that contains both $x$
and $C'$. (Such a $C$ must exist at level $i'+\ell$ for $\mathrm{addr}(C')$ to be in $\mathrm{Route}_x$.) We know that
$B(x, \varepsilon\rho^{i'+\ell}) \subseteq C$ and $d(x, \mathrm{close}_x(C')) \le \varepsilon\rho^{i'+\ell}$ since
$\mathrm{ValidPath}_x(C')$ is true in $\mathrm{Route}_x$. Thus all shortest paths from $x$ to $\mathrm{close}_x(C')$ are
entirely contained in $C$. Hence, the $\mathrm{NextHop}_v(\mathrm{addr}(C'))$ pointer at any node $v$ on one of
these paths must be non-null, since all shortest paths from $v$ to $\mathrm{close}_v(C') = \mathrm{close}_x(C')$ are
contained in $C$, the cluster $\ell$ levels above $C'$ in $T_{j'}$. ■

It remains to bound the path stretch. Consider the case when a packet is sent from $s$ to $t$. Let $C'$ be a cluster
at level $i-\ell$ returned by Step 3 of the forwarding algorithm. Note that if the level $i \le \ell$ then $C' = \{t\}$
and we send the packet directly to $t$ with $\tau = 0$. Using these short distances as the base case, we now do induction
on the distance from $s$ to $t$.
If $C'$ is a non-trivial cluster containing $t$, then we go on a shortest path from $s$ to some vertex
$v = \mathrm{close}_s(C') \in C'$. Since $t \in C'$, $d(s,v) \le d(s,t)$. Because the diameter of $C'$ is at most
$2\eta_{i-\ell}$, $d(v,t) \le 2\eta_{i-\ell} \le \varepsilon\rho^{i-1} < d(s,t)$. (The last inequality holds because if
$\varepsilon\rho^{i-1} \ge d(s,t)$, then $\mathrm{PrefMatch}_s$
would have returned a cluster at a level lower than that of $C'$ by Lemma 4.2.) Hence, we can apply the
induction hypothesis to find a path from $v$ to $t$ of length at most $(1+\tau)\,d(v,t) \le (1+\tau)\,2\eta_{i-\ell}$. The
path from $s$ to $t$ as derived from $\mathrm{Route}_s$ is of length at most
$d(s,v) + (1+\tau)\,d(v,t) \le d(s,t) + (1+\tau)\,2\eta_{i-\ell}$. The stretch
of the path from $s$ to $t$ is then at most $1 + (1+\tau)\,2\eta_{i-\ell}/d(s,t)$. This quantity is at most $1+\tau$
since $\tau \le 1$ and we have chosen constants so that $\eta_{i-\ell} \le \tau\varepsilon\rho^{i-1}/4$. ■
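The last step of the argument can be written out as a chain of inequalities. The derivation below is a sketch under two assumptions that are a reading of the proof rather than quoted constants: that $d(s,t) > \varepsilon\rho^{i-1}$ (otherwise a lower-level cluster would have been returned), and that the constants in $\ell$ guarantee $\eta_{i-\ell} \le \tau\varepsilon\rho^{i-1}/4$.

```latex
\begin{align*}
\text{stretch}
  &\le \frac{d(s,v) + (1+\tau)\,d(v,t)}{d(s,t)}
   \le 1 + \frac{(1+\tau)\,2\eta_{i-\ell}}{d(s,t)} \\
  &\le 1 + \frac{4\,\eta_{i-\ell}}{d(s,t)}
     && \text{since } \tau \le 1 \\
  &\le 1 + \frac{\tau\,\varepsilon\rho^{i-1}}{d(s,t)}
     && \text{since } \eta_{i-\ell} \le \tau\varepsilon\rho^{i-1}/4 \\
  &<   1 + \tau
     && \text{since } d(s,t) > \varepsilon\rho^{i-1}.
\end{align*}
```

The first line uses $d(s,v) \le d(s,t)$ together with the induction hypothesis on the shorter distance $d(v,t) \le 2\eta_{i-\ell}$.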

5  Routing Table Construction and Path Characteristics for $(1+\tau)$-Stretch Routing

The hierarchical routing scheme described in this section completes what is
lacking in Section 4; hence we focus primarily on the process of building up routing tables using a distributed
implementation of the Bellman-Ford algorithm for the basic model, which we introduce in Section 5.1. For overlay
routing, we store the IP address of an intermediate node to reach each destination in the routing tables, and the
process of routing table updates is similar to that of prefix routing, e.g., in [14]. Although the forwarding
algorithm remains the same as that in Section 4.3, we will elaborate in more detail on its behavior in
Section 5.2 when it is coupled with the new routing algorithm.

Our routing scheme is similar in spirit to the Closest Entry Routing (CER) scheme described
by Kleinrock and Kamoun (KK) [16]. They define a hierarchical routing scheme by first specifying an
“optimal” underlying hierarchical clustering structure that they impose on the network nodes, where the
optimization objective is to minimize the routing table length; each level-$k$ cluster is defined recursively as
a set of level-$(k-1)$ clusters, with the level-0 clusters being individual nodes. This leads naturally to a tree
representation as shown in Figure 5.3 (a), where internal tree nodes represent clusters; Figure 5.3 (b) shows
that the destination addresses in the routing table of node A correspond to clusters at different levels of the
decomposition tree, hence reflecting the structure of the hierarchical clustering of network nodes. In KK,
two nodes share common routing table entries for all the clusters that contain both of them. KK assumes
that all clusters at the same level have the same number of sub-clusters within them, and each cluster is
a connected component. The KK hierarchical routing procedure leads a message down a tree path, fixing
more prefix digits at each step, much as in prefix routing, traversing smaller and smaller clusters that contain
the destination node until it reaches the destination itself.


(a) A tree representation   (b) Routing table entries in node A

Figure 5.3: A 4-level hierarchical clustering structure of network nodes

The reduction of routing table size generally leads to an increase in network path length. In order to
derive bounds on the increase in the average path length, they further assume that a shortest path between
two nodes in a cluster lies within the cluster. They also prescribe an upper bound of $d_k$ on the (strong)
diameter of a level-$k$ cluster, with $d_k$ decreasing as $k$ decreases. They show that routing schemes based
on the hierarchical clustering model cause essentially no increase in the average network path length for
a family of large distributed networks. Specifically, the networks they consider are all connected graphs
upon which it is possible to fit a hierarchical clustering whose outcome satisfies the assumptions above. In
addition, (a) the resulting clusters at any level satisfy the following: the diameter of any cluster $S$ chosen is
bounded above by $O(|S|^{\nu})$ for some constant $\nu \in [0, 1]$, and (b) the average distance between nodes in the
network is $\Theta(N^{\nu})$, where $N$ is the size of such a network.

In contrast, our hierarchical routing schemes give bounds on the path stretch on a per node-pair level
on certain networks that are connected graphs $G$, where the natural metric $(X, d)$ induced by shortest path
distances between any pair of nodes in $G$ is a doubling metric. In addition, the main improvement of our work
over that of KK is this: while the KK routing scheme is based on assumptions regarding the existence of a “good”
partition of the network, the method itself does not provide an algorithm for computing such a partition;
we are able to prove the existence of a $(\rho,\varepsilon)$-PPHD with a support on $m$ hierarchical decompositions
and actually find them by following the Clustering algorithm and its constructive algorithm described in
Section 3. Note that while we guarantee a degree bound for the decomposition trees across all levels, we do
not require that the degrees be exactly the same.

It would be ideal if, once we construct such a set of network partitions, we could run the hierarchical
routing algorithm specified in KK on each individual decomposition tree. However, it is not possible to
directly apply KK’s routing scheme or their proof techniques, for three reasons. First, while KK assumes
that each cluster subnetwork is fully connected, this is not satisfied in our decomposition. Second, the
shortest paths between two nodes in a cluster are not guaranteed to stay within the cluster. Finally, although
the maximal distance in $G$ between vertices of $C_k$, for all $0 \le k \le h$, is bounded by the diameter of
$C_k$, $2\eta_k$, which is geometrically decreasing as $k$ decreases, it is a weak diameter bound and is not necessarily
satisfied by the distance induced by the subgraph corresponding to each cluster $C_k$.

We thus adopt as many definitions and notations as possible from KK in this section, while inventing
some new techniques for addressing the above issues in the design and specification of a modified
hierarchical routing scheme given a $(\rho,\varepsilon)$-PPHD $\mu_m$ with a support on $m$ HDs, and in the analysis of
the characteristics of paths as induced by the routing tables thus created. The important property of a
$(\rho,\varepsilon)$-PPHD that we will use in defining our routing scheme is that, for $\rho^{i-1} \le r < \rho^i$, there is at
least one tree $T_j$ such that $B(s, \varepsilon r)$ is contained in a level-$i$ cluster $C_i$ in the level-$i$ partition
$\Pi_i^{(j)}$; since a ball is a connected component, all shortest paths from $s$ to vertices within $B(s, \varepsilon r)$
must be contained within $C_i$ in the level-$i$ partition $\Pi_i^{(j)}$.

5.1  Routing  Table  Update  Rules  In  this  section,  we  focus  on  the  process  of  building  up  routing  tables 
once  the  nodes  in  the  network  have  been  assigned  addresses  that  reflect  their  positions  in  each  of  the  m 
decomposition  trees.  During  this  process,  routing  information  is  aggregated  and  exchanged  between  special 
nodes  in  different  clusters  at  each  level.  We  refer  to  such  special  nodes  as  exchange  nodes  (for  routing)  or 
entry  points  (for  packet  forwarding)  of  their  corresponding  clusters.  The  algorithm  for  selecting  exchange 
nodes  for  each  cluster  is  an  independent  issue  that  we  do  not  address  in  this  paper.  Similar  to  the  CER 
hierarchical  routing  scheme  described  in  KK,  no  routing  information  describing  the  internal  behavior  of 
a  cluster  is  propagated  outside  a  cluster;  hence  a  cluster  is  regarded  from  outside  as  a  single  node  whose 
distance  to  itself  is  zero. 

We use a modified version of the distributed Bellman-Ford algorithm, as in Fig 5.4, to perform routing
updates: in particular, to establish the next-hop entries and update estimated path lengths for destination
clusters in the routing tables for the basic model. For routing updates, we are going to focus on entries
for one specific decomposition tree $T_j$ that corresponds to $\Pi^{(j)}$.

Let $s$ and $t$ be two neighboring nodes (that is, they are connected by a channel $(s,t)$) which belong to
the same level-$k$ cluster $C_k \in \Pi_k^{(j)}$ and not to any common lower-level cluster in $T_j$, where
$k \in \{1, 2, \ldots, h\}$. Let $C_{k-1}(s), C_{k-1}(t) \in \Pi_{k-1}^{(j)}$ respectively denote the level-$(k-1)$
clusters to which $s$ and $t$ each belong in tree $T_j$. Let $C_k(s,t)$ denote the level-$k$ cluster that contains both
$s$ and $t$; note that $C_{k-1}(s), C_{k-1}(t) \subseteq C_k(s,t)$ in $T_j$ since $T_j$ represents a laminar
decomposition. We use $\mathrm{lca}_j(s,t)$ to denote the lowest common ancestor of $s$ and $t$ in a particular tree
$T_j$; hence $\mathrm{lca}_j(s,t) = C_k(s,t) \in \Pi_k^{(j)}$. For a pair of nodes $s$, $t$, $\mathrm{lca}_j(s,t)$ can be
determined by inspecting the common prefixes of the local addresses $\mathrm{addr}_j(s)$ and $\mathrm{addr}_j(t)$.
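The prefix inspection can be sketched as follows, assuming (as an illustrative convention, not one fixed by the paper) that a local address is a tuple of $h$ labels ordered from the top level down, so that index 0 holds the block for level $h-1$. The function name is hypothetical.

```python
from typing import Sequence


def lca_level(addr_s: Sequence[int], addr_t: Sequence[int]) -> int:
    """Level of the lowest common ancestor cluster of two nodes in one tree.

    If the first k blocks agree, the nodes share their clusters at levels
    h-1, ..., h-k, so the lowest common ancestor sits at level h - k.
    """
    if len(addr_s) != len(addr_t):
        raise ValueError("local addresses in one tree must have equal length")
    h = len(addr_s)
    k = 0
    while k < h and addr_s[k] == addr_t[k]:
        k += 1
    return h - k


if __name__ == "__main__":
    s = (3, 1, 2, 5)
    t = (3, 1, 4, 5)
    print(lca_level(s, t))   # the nodes separate below level 2
```

Blocks after the first mismatch are ignored: equal labels under different parents refer to different clusters.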

Recall that in node $s$, for any cluster $C_i(s)$ in $T_j$ that contains $s$ at level $i$, for all $i = 0, \ldots, h$,
routing table entries are kept for all clusters that are descendants of $C_i(s) \in \Pi_i^{(j)}$ within $\ell$ levels down
the decomposition tree $T_j$, for all $j$. Thus each entry in the routing table $\mathrm{Route}_s$ for $T_j$ corresponds to
some level-$i'$ cluster $C' \in \Pi_{i'}^{(j)}$ in $T_j$, where $i' = 0, 1, \ldots, h-1$; that entry is also denoted as $C'$
and indexed by the global address $\mathrm{addr}(C')$ of its associated cluster $C'$, and contains the following fields in
$\mathrm{Route}_s$: (a) a next hop $\mathrm{NextHop}_s(\mathrm{addr}(C'))$ to reach $C'$ from $s$, (b) a path length field
$HF(s, C')$ that is the current path length at node $s$ for reaching cluster $C'$ through
$\mathrm{NextHop}_s(\mathrm{addr}(C'))$, and (c) a $\mathrm{ValidPath}_s(C')$ bit. Initially, the path length fields for
all the entries in $\mathrm{Route}_s$ for tree $T_j$ are set to $\infty$ except for the self entries, as shown in the
Initialization Procedure in Fig 5.4.

We use $C_i(s, C') = C_i(s) \in \Pi_i^{(j)}$ to denote the level-$i$ common ancestor of $s$ and $C' \in \Pi_{i'}^{(j)}$
such that $i \ge i'+1$ and $C_i(s) \supseteq C'$. Note that $C_h(s, C') = C_h(s)$ contains $C' \in \Pi_{i'}^{(j)}$ for all
$i' \le h-1$, since $C_h(s)$ contains the entire network. Similarly, we use $\mathrm{lca}_j(s, C')$ to denote the lowest
common ancestor of $s$ and $C' \in \Pi_{i'}^{(j)}$ in tree $T_j$, where
$C' \subseteq \mathrm{lca}_j(s, C') \subseteq C_i(s, C')$ for all $i$ such that $C' \subseteq C_i(s)$. For node $s$ and
cluster $C'$, $\mathrm{lca}_j(s, C')$ can be determined by inspecting the common prefixes of the local addresses
$\mathrm{addr}_j(s)$ and $\mathrm{addr}_j(C')$.

As a consequence of the routing table specification, routing table entries at nodes $s$ and $t$ at all levels below $k-\ell$ in $T_j$ refer to different cluster destinations, whereas all the other entries from level $k-\ell$ up to $h$ refer to the same cluster destinations in $T_j$. The objective of the updating procedure is to compare the estimated lengths of the paths from $s$ or $t$ to any common destination and to update the routing tables to reflect the shorter paths. Whenever $s$ receives a route update from $t$, for each common destination cluster $C'$, its corresponding entry is potentially updated with a new next hop $\mathrm{NextHop}_s(\mathrm{addr}(C'))$, the path length $\mathrm{HF}(s,C')$ through the new $\mathrm{NextHop}_s(\mathrm{addr}(C'))$ as in Steps 2-4, and the $\mathrm{ValidPath}_s(C')$ bit as in Steps 5-9 of the Route Update Procedure in Fig 5.4.

We have a slightly different way of setting the $\mathrm{ValidPath}_s(C')$ bit from that specified in Section 4.2, in order to maximize the chance of setting it true. However, as before, once the $\mathrm{ValidPath}_s(C')$ bit is set to true, a shortest path from $s$ to $C'$ is indeed guaranteed by following the next hop in $\mathrm{Route}_s$ for an entry $C'$ and that in $\mathrm{Route}_v$ of each subsequent node $v$ along the path from $s$ to an entry point of $C'$.

Let a common destination entry for $T_j$ in $\mathrm{Route}_s$ and $\mathrm{Route}_t$ correspond to a level-$i'$ cluster $C' \in \Pi_{i'}^{(j)}$, where $i' \ge k-\ell$. We denote the level of $\mathrm{lca}^j(s,C')$ in $T_j$ as $l_0$. The inequalities $i'+1 \le l_0 \le i'+\ell$ must be satisfied for $C'$ to be an entry in $\mathrm{Route}_s$. The $\mathrm{ValidPath}_s(C')$ bit is set to true so long as for "any" of the common ancestors $C_i(s,C')$ of $s$ and $C'$ at level $i$, where $l_0 \le i \le i'+\ell$, both $\mathrm{HF}(s,C') \le \epsilon\rho^i$ and $\mathbf{B}(s,\epsilon\rho^i) \subseteq C_i(s,C')$ are true. It is set to false otherwise. Note that when $i' \ge h-\ell$, both $\mathrm{HF}(s,C') \le \Delta$ and $\mathbf{B}(s,\Delta) \subseteq C_h(s,C')$ are always true since $C_h(s,C')$ is the entire network; hence we set the $\mathrm{ValidPath}_s(C')$ bit true for all $C'$ at level $h-\ell$ and above in Step 5 of the Initialization Procedure.
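
The rule for the bit can be sketched as a small predicate. This is an illustration under stated assumptions, not the paper's code: `ball_in_cluster(i)` is a hypothetical callback answering whether $\mathbf{B}(s,\epsilon\rho^i) \subseteq C_i(s)$ in the tree at hand, and the comparison is taken as non-strict.

```python
def valid_path(hf, l0, i_prime, ell, eps, rho, ball_in_cluster):
    """Return True if some level i in [l0, i' + ell] satisfies both conditions.

    hf               -- HF(s, C'), the current path length estimate
    ball_in_cluster  -- hypothetical predicate: is B(s, eps * rho**i) inside C_i(s)?
    """
    for i in range(l0, i_prime + ell + 1):
        radius = eps * rho ** i
        if hf <= radius and ball_in_cluster(i):
            return True
    return False


# Example: radii are 2, 8, 32 for levels 1..3; only the level-3 ball is uncut.
print(valid_path(5.0, l0=1, i_prime=0, ell=3, eps=0.5, rho=4,
                 ball_in_cluster=lambda i: i >= 3))
```

In the example the level-2 radius already covers the path length, but the bit is only justified at level 3, where the ball is also contained in the cluster.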

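The reason we set the $\mathrm{ValidPath}_s(C')$ bit this way is the following. Recall that by constructing the $m$ decomposition trees, each node $s$ "knows" if $\mathbf{B}(s,\epsilon\rho^i)$ is contained in $C_i(s) \in \Pi_i^{(j)}$ in tree $T_j$; naturally, if $\mathbf{B}(s,\epsilon\rho^i) \subseteq C_i(s) \in \Pi_i^{(j)}$ then $\mathbf{B}(s,\epsilon\rho^i) \subseteq C_l(s) \in \Pi_l^{(j)}$ is true for all $l \ge i$. However, if $\mathbf{B}(s,\epsilon\rho^i) \not\subseteq C_i(s)$, we do not assume that we know information such as "whether a ball $\mathbf{B}(s,r)$ of a radius $\epsilon\rho^i > r > \epsilon\rho^{i-1}$ is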
Initialization Procedure: initialize $\mathrm{Route}_s$ for tree $T_j$ at node $s$

1. For $i = 0, 1, \ldots, h$
2.     $\mathrm{HF}(s, C_i(s)) = 0$, and $\mathrm{ValidPath}_s(C_i(s))$ = true
3. For all other entries $C' \not\ni s$, let $i'$ = level of $C'$ in tree $T_j$
4.     $\mathrm{HF}(s, C') = \infty$
5.     If $i' \ge h-\ell$, then $\mathrm{ValidPath}_s(C')$ = true
6. End

Route Update Procedure: upon receiving a route update from $t$ such that $\mathrm{lca}^j(s,t) = C_k$

1. For each common entry $C' \in \Pi_{i'}^{(j)}$, which represents a level-$i'$ cluster in $T_j$, where $i' \ge k-\ell$
2.     If $\mathrm{HF}(s,C') > d(s,t) + \mathrm{HF}(t,C')$, then
3.         $\mathrm{HF}(s,C') \leftarrow d(s,t) + \mathrm{HF}(t,C')$
4.         nexthop field of $C' \leftarrow t$
5.     If $i' < h-\ell$, then
6.         Let $l_0$ = level of $\mathrm{lca}^j(s,C')$ in $T_j$ and $m$ satisfies $\epsilon\rho^{m-1} < \mathrm{HF}(s,C') \le \epsilon\rho^{m}$
7.         for all levels $i$: $\max\{l_0, m\} \le i \le i'+\ell$
8.             If $\mathbf{B}(s,\epsilon\rho^{m}) \subseteq \mathbf{B}(s,\epsilon\rho^{i}) \subseteq C_i(s)$ in $T_j$, then
9.                 $\mathrm{ValidPath}_s(C')$ = true
10.                Goto 1
11. End

Figure 5.4: Distributed Bellman-Ford Algorithm for $T_j$ at Node $s$



contained in $C_i(s)$ or not", since that is not the type of information that our constructive algorithm provides by default; note that if $r \le \epsilon\rho^{i-1}$, we will just check if $\mathbf{B}(s,\epsilon\rho^{i-1}) \subseteq C_{i-1}(s)$ to decide if $\mathbf{B}(s,r) \subseteq C_i(s)$. Our routing algorithm thus makes minimal assumptions about the information that is available at each node about balls around it being contained at a certain level or not.
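
The Initialization Procedure and the relaxation in Steps 2-4 of the Route Update Procedure in Fig 5.4 can be sketched as follows. This is a simplified illustration, not the protocol itself: the table layout, the names `Entry`, `init_table`, and `relax`, and the `link_len` parameter (the length of the link over which the update arrives) are assumptions; "common entries" are approximated by the cluster addresses present in both tables; and the $\mathrm{ValidPath}$ refresh of Steps 5-9 is left out.

```python
import math
from dataclasses import dataclass
from typing import Hashable, Optional


@dataclass
class Entry:
    level: int                           # i', level of the destination cluster C'
    hf: float = math.inf                 # HF(s, C'): current path length estimate
    next_hop: Optional[Hashable] = None  # NextHop_s(addr(C'))
    valid: bool = False                  # ValidPath_s(C')


def init_table(own_clusters, other_clusters, h, ell):
    """own_clusters: addresses of C_0(s), ..., C_h(s); other_clusters: {addr: level}."""
    table = {addr: Entry(level=i, hf=0.0, valid=True)
             for i, addr in enumerate(own_clusters)}
    for addr, level in other_clusters.items():
        # Entries at level h - ell and above are valid from the start (Step 5).
        table[addr] = Entry(level=level, valid=(level >= h - ell))
    return table


def relax(table_s, table_t, t, link_len):
    """Steps 2-4: adopt t as next hop wherever the route through t is shorter."""
    changed = False
    for addr, entry_t in table_t.items():
        entry_s = table_s.get(addr)
        if entry_s is not None and entry_s.hf > link_len + entry_t.hf:
            entry_s.hf = link_len + entry_t.hf
            entry_s.next_hop = t
            changed = True
    return changed
```

Calling `relax` repeatedly for every received update until it returns `False` everywhere corresponds to the equilibrium condition discussed later in this section.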

Another specification in terms of routing that is different from that of Section 4.2 is the following. Assume we route a packet from $s$ toward $C'$. Instead of assuming the packet should always enter a cluster $C'$ through the closest point $x = \mathrm{close}_s(C')$ in $C'$ to $s$, we only require that the packet enters $C'$ through a closest entry point $e_0 \in C'$. Correspondingly, for node $s$ and a level-$i'$ cluster $C' \in \Pi_{i'}^{(j)}$ in $T_j$, the function $\mathrm{NextHop}_s(\mathrm{addr}(C'))$ returns the next hop on the path from $s$ to $e_0$ provided that the next hop does not leave the cluster $C$ at level $(i'+\ell)$ that contains $C'$, and null otherwise. Recall that an entry point $e_0 \in C'$ advertises routes for the cluster $C'$ it belongs to. Note also that $e_0$ does not need to be the closest one to $s$ in $C'$ in order to achieve $(1+\tau)$-stretch routing. (This is also true for overlay routing.) As a basic routing scheme, we keep a next hop $\mathrm{NextHop}_s(\mathrm{addr}(C'))$ in $\mathrm{Route}_s$ toward a closest entry point $e_0 \in C'$ for the sake of routing table consistency, which we will elaborate on shortly.

For overlay routing, we keep the IP address of an arbitrary entry point $e_0$ to $C'$ (instead of a next hop $\mathrm{NextHop}_s(\mathrm{addr}(C'))$ toward $e_0$), since IP routing will deliver a packet from $s$ to $e_0$ directly given the IP address of $e_0$, without having to rely on hop-by-hop forwarding as in the basic model that we focus on in this section.

Definition 5.1. We call a path an internal path in cluster $C$ if all the nodes in that path belong to $C$.

Similar to KK, we define the equilibrium condition as the situation when no changes occur in the topology of the network and the contents of $\mathrm{HF}(s,C')$ in the routing table reach "minimal" constant values after a certain number of updates.
Claim 5.2. The distributed Bellman-Ford algorithm guarantees that in equilibrium condition, $\mathrm{HF}(s,C')$ will be the length of the shortest path from $s$ to a closest entry point $e_0$ of $C'$ when $\mathrm{ValidPath}_s(C')$ is true, i.e., $\mathrm{HF}(s,C') = d(s,e_0)$ in $\mathrm{Route}_s$.

Proof. Let the level of $C' \in \Pi_{i'}^{(j)}$ in tree $T_j$ be $i' < h-\ell$ and let the level of $\mathrm{lca}^j(s,C')$ be $l_0$. We only set $\mathrm{ValidPath}_s(C')$ true in the routing algorithm when for "any" of the level-$i$ clusters $C_i(s) \in \Pi_i^{(j)}$, where $l_0 \le i \le i'+\ell$, both $\mathrm{HF}(s,C') \le \epsilon\rho^i$ and $\mathbf{B}(s,\epsilon\rho^i) \subseteq C_i(s) \in \Pi_i^{(j)}$ hold. Denote the lowest such level by $r$, where $r \in [l_0, i'+\ell]$. All shortest paths from $s$ to some entry point $x' \in C'$ of distance $d(s,x') \le \mathrm{HF}(s,C') \le \epsilon\rho^r$ are thus internal to $C_r(s) \in \Pi_r^{(j)}$ in $T_j$, since such paths are contained in $\mathbf{B}(s,\epsilon\rho^r)$, which is a connected component entirely contained in $C_r(s) \in \Pi_r^{(j)}$. Note that some $x' \in C'$ must have advertised itself as an entry point to $C'$ for such paths to be established within $\{C_r(s) - C'\}$ and for $C' \in \Pi_{i'}^{(j)}$ to appear in $\mathrm{Route}_s$. Thus $C' \subseteq C_r(s)$ since $x' \in \{C' \cap C_r(s)\} \neq \emptyset$ and $r \ge l_0$; we thus denote $C_r(s)$ as $C_r(s,C')$ from this point on.

In addition, every node $v \in C_r(s,C')$, including those along the shortest paths from $s$ to $x'$ inside $\mathbf{B}(s,\epsilon\rho^r)$, contains a routing table entry to $C'$, since $C'$ is a descendant of $C_r(s,C')$ within $\ell$ levels down the decomposition tree $T_j$. Propagation and subsequent updating of routing information among nodes of $C_r(s,C')$ is equivalent to finding the minimum path internal to $C_r(s,C')$ from any node $v \in \{C_r(s,C') - C'\}$ to an entry point of $C'$ that is closest to node $v$; for $s$, the closest entry point to $C'$ is $e_0$.

Improvements are made sequentially at each update over the distance $\mathrm{HF}(u,C')$ from $u$ to $C'$ among nodes within $\mathbf{B}(s,\epsilon\rho^r)$, until it reaches a minimal constant value if no changes occur in the topology of the network; hence every $u \in \mathbf{B}(s,\epsilon\rho^r)$ "knows" how to route to $C'$ with a path of bounded length. Given multiple entry points to $C'$, the distributed Bellman-Ford algorithm guarantees that we find a shortest path not only to some entry point $x'$ of $C'$, but also to the closest, $e_0$ of $C'$, from $s$ in equilibrium condition, i.e., $\mathrm{HF}(s,C') = d(s,e_0)$. The entire path stays within $\mathbf{B}(s,\epsilon\rho^r) \subseteq C_r(s,C')$, where $r$ is specified as above.

Note that when $i' \ge h-\ell$, both $\mathrm{HF}(s,C') \le \Delta$ and $\mathbf{B}(s,\Delta) \subseteq C_h(s,C')$ are always true since $C_h(s,C')$ is the entire network; hence we set $\mathrm{ValidPath}_s(C')$ true for all $C'$ at level $h-\ell$ and above. The same argument as above applies to this case. ■
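
The equilibrium behaviour in Claim 5.2 can be illustrated on a toy graph. The sketch below is a centralized stand-in for the distributed exchange: it repeats the relaxation until nothing changes. It relaxes over the whole graph rather than only inside $C_r(s,C')$, and the function name and graph are made up for the example.

```python
import math


def bellman_ford_to_cluster(adj, cluster):
    """adj: {u: {v: link_length}} (symmetric); cluster: node set of C'.

    Returns (hf, nxt): hf[u] plays the role of HF(u, C') at equilibrium and
    nxt[u] the role of NextHop_u(addr(C')).
    """
    hf = {u: (0.0 if u in cluster else math.inf) for u in adj}
    nxt = {u: None for u in adj}
    changed = True
    while changed:                      # stop once no update improves any entry
        changed = False
        for u in adj:
            for v, w in adj[u].items():
                if hf[u] > w + hf[v]:
                    hf[u], nxt[u] = w + hf[v], v
                    changed = True
    return hf, nxt


# C' = {e1, e2}; e2 is the entry point closest to s (distance 2 via b).
adj = {
    "s": {"a": 1, "b": 1},
    "a": {"s": 1, "e1": 4},
    "b": {"s": 1, "e2": 1},
    "e1": {"a": 4, "e2": 1},
    "e2": {"b": 1, "e1": 1},
}
hf, nxt = bellman_ford_to_cluster(adj, {"e1", "e2"})
print(hf["s"], nxt["s"])
```

Node `a` is adjacent to the entry point `e1`, yet at equilibrium it also routes toward `e2` (through `s`), because that path is shorter than its direct link of length 4.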

The reason we require a closest entry point to $C'$ is primarily for route convergence purposes when our protocol serves as an underlying routing scheme. For overlay routing, we allow an entry point to be any exchange node or simply a random node within the cluster, which is commonly assumed in peer-to-peer networks. Note that an exchange node of a given cluster is a node of that cluster which is connected to one or more nodes external to that cluster, as defined in KK. We will use exchange node and entry point interchangeably unless we specify otherwise. The $(1+\tau)$-stretch property we are going to prove for hierarchical routing paths does not require the entry point for a cluster $C'$ to be the closest to $s$ either, a point that we will not elaborate on from now on.

Fact 5.3. If a shortest path from $s$ to $e_0$, an entry point to a level-$i'$ cluster $C' \in \Pi_{i'}^{(j)}$, is internal to $C_i(s) \in \Pi_i^{(j)}$ in tree $T_j$, where $i > i'$, then cluster $C' \ni e_0$ must be a sub-cluster that is entirely contained in $C_i(s)$ in $T_j$, i.e., $C' \subseteq C_i(s)$.

Proof. First observe that $e_0 \in \{C_i(s) \cap C'\}$, since the shortest path from $s$ to $e_0$ is internal to $C_i(s) \in \Pi_i^{(j)}$ in $T_j$. Since $T_j$ represents a laminar decomposition, where a lower level cluster is always entirely contained in a higher level cluster, $e_0 \in C_i$ is sufficient to guarantee that $\{C' \ni e_0\} \subseteq C_i$. ■

5.2 Path Characteristics  We forward packets according to the Forwarding algorithm in Figure 4.2. Let $s$ and $t$ be two arbitrary nodes. For destination $t$, let $C'(t)$ be the cluster whose $\mathrm{addr}(C'(t))$ is returned by the function $\mathrm{PrefMatch}_s(t)$ at Step 3 of the Forwarding algorithm. We assume, w.l.o.g., $C'(t) \in \Pi_{i'}^{(j)}$, i.e., $C'(t)$ is in the level-$i'$ partition, where $i' \le h-\ell$, in tree $T_j$. Recall that $h = \lceil \log_\rho \Delta \rceil$. Let $l_0 \le h$ be the level of $\mathrm{lca}^j(s,C'(t))$ in $T_j$.

We say $C'(t) \ni t$ is the cluster that has the longest valid prefix matching with $t$ in $\mathrm{Route}_s$, since the level of $C'(t)$ is the lowest across all trees among clusters $C'$ in $\mathrm{Route}_s$ such that $C' \ni t$ and $\mathrm{ValidPath}_s(C')$ is true. Before we proceed, we first give more definitions, some of which are adapted from KK.

$h^{c}_{s,t}$: Length of the estimated minimum path from node $s$ to node $t$ as derived from the routing information at node $s$. (The superscript $c$ stands for clustered routing.)

Exchange node $e_i$: a node of a cluster $C$ that is connected to one or more nodes external to $C$.

$A_i(t)$: Subset of all exchange nodes (entry points) that connect a level-$i$ cluster $C_i(t) \in \Pi_i^{(j)}$ in tree $T_j$, for all $j = 1, \ldots, m$, with any other level-$i$ cluster within the same ancestor $C_n(t) \in \Pi_n^{(j)}$ in the same tree $T_j$, for all $n \le i+\ell$. From the above definitions, all entry points of $C'(t) \in \Pi_{i'}^{(j)}$ that connect $C'(t)$ to any other level-$i'$ cluster that stays within $C_{i'+\ell}(t) \in \Pi_{i'+\ell}^{(j)}$ in tree $T_j$ hence belong to $A_{i'}(t)$.

Let $e_0 \in A_{i'}(t) \cap C'(t)$ be the closest entry point for $s$ to reach $C'(t) \in \Pi_{i'}^{(j)}$ in $T_j$.

$C_k(s,t)$: For $k \le h-1$, $C_k(s,t) \in \Pi_k^{(j)}$ is the level-$k$ cluster in $T_j$, where $l_0 \le k \le i'+\ell$, that is the lowest-level common cluster of $s$ and $t$ such that $\mathbf{B}(s,\epsilon\rho^k) \subseteq C_k(s,t)$ and $\mathbf{B}(s,\epsilon\rho^k)$ contains a shortest path from $s$ to $C'(t) \in \Pi_{i'}^{(j)}$ in $T_j$, where $i' < h-\ell$. Such a $C_k(s,t) \in \Pi_k^{(j)}$ always exists, since we know that $\mathbf{B}(s,\epsilon\rho^r) \subseteq C_r(s,t)$ and $\mathrm{HF}(s,C'(t)) \le \epsilon\rho^r$ must both hold for some $l_0 \le r \le i'+\ell$, given that $\mathrm{ValidPath}_s(C'(t))$ is true in $\mathrm{Route}_s$, due to the specification of the distributed Bellman-Ford algorithm. Let $k$ be the lowest such level $r$. Note that $C'(t) \subseteq C_k(s,t)$ since $T_j$ represents a laminar decomposition and $k$ is at least $l_0$. For $k = h$, $C_k(s,t) = C_h(s,t) \in \Pi_h^{(j)}$ is the root cluster $X$ of $T_j$ that corresponds to the entire network $G$. In this case, $C_k(s,t) = C_h(s,t)$ always contains all shortest paths from $s$ to $C'(t) \in \Pi_{i'}^{(j)}$ in $T_j$, where $i' = h-\ell$, given that $G$ is a connected graph.

$h^{i}_{s,e_i}$: Length of the shortest path from node $s$ to an exchange node $e_i \in A_{i'}(t) \cap C'(t)$ as contained in $C_k(s,t)$ defined above. The superscript $i$ stands for an internal path within $C_k(s,t)$. At equilibrium, $h^{i}_{s,e_0} = \mathrm{HF}(s,C'(t)) = d(s,e_0)$ since the shortest path from $s$ to $e_0$ is internal to $C_k(s,t)$, and by Claim 5.2, $\mathrm{HF}(s,C'(t)) = d(s,e_0)$ in $\mathrm{Route}_s$ given that $\mathrm{ValidPath}_s(C'(t))$ is true in $\mathrm{Route}_s$ and $e_0$ is the closest entry point to $C'(t)$ for node $s$. Recall that $\mathrm{HF}(s,C'(t))$ is the current path length field in $\mathrm{Route}_s$ for node $s$ to reach $C'(t) \in \Pi_{i'}^{(j)}$ via its current $\mathrm{NextHop}_s(\mathrm{addr}(C'(t)))$. Note that when the shortest path from $s$ to $e_i$ is not internal to $C_k(s,t)$, we denote it with $h^{i}_{s,e_i} = \infty$.

In order to reach $t$, function $\mathrm{PrefMatch}_s(t)$ is called by the Forwarding algorithm at node $s$, which looks across $\mathrm{Route}_s$ for all trees and picks a tree $T_j$ that contains $C'(t)$ with a closest entry point $e_0 \in A_{i'}(t) \cap C'(t)$. Node $s$ then stores $(\mathrm{addr}(t), \mathrm{addr}(C'(t)))$ in the packet header and sends the packet to $\mathrm{NextHop}_s(\mathrm{addr}(C'(t)))$; the packet header remains the same while intermediate nodes $v$ forward the packet along a shortest path from $s$ to $e_0$ that is contained in the common cluster $C_k(s,t)$ of $s$ and $t$ in $T_j$, until it reaches $e_0$.
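
The selection performed by $\mathrm{PrefMatch}_s(t)$ can be sketched as a minimum over the valid entries that contain $t$. The full definition of the function lies outside this section, so the sketch rests on assumptions: membership is tested on an explicit node set (standing in for a prefix test on the local address of $t$), and ties between trees at the same level are broken by the smaller path length field, which is one reading of "with a closest entry point". The names `Entry` and `pref_match` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Hashable


@dataclass(frozen=True)
class Entry:
    tree: int                       # index j of the tree T_j
    level: int                      # i', level of the cluster
    members: FrozenSet[Hashable]    # stand-in for a prefix test on addr_j(t)
    valid: bool                     # ValidPath_s(C')
    hf: float                       # HF(s, C')


def pref_match(entries, t):
    """Lowest-level valid cluster containing t, across all trees (None if absent)."""
    candidates = [e for e in entries if e.valid and t in e.members]
    # Lower level means a longer prefix match; hf breaks ties (assumption).
    return min(candidates, key=lambda e: (e.level, e.hf), default=None)


entries = [
    Entry(tree=0, level=2, members=frozenset({"t", "u"}), valid=True, hf=5.0),
    Entry(tree=1, level=1, members=frozenset({"t"}), valid=False, hf=1.0),
    Entry(tree=1, level=2, members=frozenset({"t", "v"}), valid=True, hf=3.0),
]
best = pref_match(entries, "t")
print(best.tree, best.level)
```

The level-1 entry is skipped because its $\mathrm{ValidPath}$ bit is false, so the match falls back to level 2 and the tree with the shorter recorded path.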

The key observation we have regarding a path from $s$ to $t$ is the following. The path may not be contained within the lowest common ancestor $\mathrm{lca}^j(s,t)$ of $s$ and $t$ in a particular tree $T_j$. However, the segment from $s$ to $C'(t)$ is contained within $C_k(s,t)$ in $T_j$, where $l_0 \le k \le i'+\ell$, when following a shortest path from $s$ to $e_0$, which is the closest entry point to $C'(t)$. Recall that $C_k(s,t)$ is a common cluster of $s$ and $C'(t)$ at a level higher than that of $\mathrm{lca}^j(s,t)$. Conceptually, we route packets from $s$ to $t$ within $C_k(s,t) \in \Pi_k^{(j)}$ to avoid being stuck in $\mathrm{lca}^j(s,t)$, which may not contain any path (e.g., when $\mathrm{lca}^j(s,t)$ in $T_j$ is disconnected) or may contain only very long paths from $s$ to $t$. The shortest path from $s$ to $e_0$ is thus an internal path relative to $C_k(s,t)$, which we denote with $h^{i}_{s,e_0}$.

Finally,  We  define  a  consfanf  (|)  =  ^  thaf  we  will  use  fhroughouf  Ibis  seefion.  If  is  easy  to  verify 

P  ^ 
that  $2\eta_{i-\ell}$  <  <  $\phi\epsilon\rho^{i}$.  Recall  that  $\ell = \Theta(\log_\rho 1/\epsilon\tau)$  and  $\rho$  =  where  we  choose  suitable

constants  so  that  p^  <  is  satisfied.  The  rest  of  this  section  is  dedicated  to  the  proof  of  the  main  theorem 
of this section, before which we first prove two lemmas regarding the level of $C'(t)$ and $C_k(s,t)$ given $d(s,t)$. Note that we always have $k \le h$ and $i' \le h-\ell$. We will ignore the case when $k = h$ until the end of this section.

Lemma 5.1. Let $d(s,t) \le (1-\phi)\epsilon\rho^{i}$, where $1 \le i \le h$. The cluster $C'(t) \in \Pi_{i'}^{(j)}$ in $T_j$ that has the longest valid prefix matching with $t$, with $\mathrm{ValidPath}_s(C'(t))$ = true, is at a level $i' \le \max(0, i-\ell)$; the common cluster $C_k(s,t) \in \Pi_k^{(j)}$ as defined above that contains the shortest path from $s$ to $C'(t)$ is at level $k \le i$.

Proof. We first prove the lemma when $i \le \ell$ with the following claim.

Case $i \le \ell$.

Claim 5.4. Let $\epsilon\rho^{i-1} < d(s,t) \le \epsilon\rho^{i}$ for $1 \le i \le \ell$. Then $C'(t) = C_0(t)$ is $t$ itself; the lowest common cluster $C_k(s,t)$ such that $\mathbf{B}(s,\epsilon\rho^k) \subseteq C_k(s,t)$ and $\mathbf{B}(s,\epsilon\rho^k)$ contains the shortest path from $s$ to $C_0(t)$, i.e., $t$ itself, is at level $k = i$.

Proof. Node $s$ has a routing table entry for all $t$ such that $d(s,t) \le \epsilon\rho^{\ell}$, since $\mathbf{B}(s,d(s,t)) \subseteq \mathbf{B}(s,\epsilon\rho^{\ell})$ is fully contained in some level-$\ell$ cluster $C_\ell(s) \in \Pi_\ell^{(j)}$ in some tree $T_j$, and $C'(t)$ is $C_0(t) \in \Pi_0^{(j)}$.

The properties of the $(\rho,\epsilon)$-PPHD ensure that there is at least one tree $T_j$ such that $\mathbf{B}(s,\epsilon\rho^{i}) \subseteq C_i(s) \in \Pi_i^{(j)}$ in $T_j$. Since $d(s,t) \le \epsilon\rho^{i}$, we know that $t \in \mathbf{B}(s,\epsilon\rho^{i})$ and $C_0(t) \subseteq C_i(s)$ in $T_j$. The lowest common cluster $C_k(s,t)$ such that $\mathbf{B}(s,\epsilon\rho^k) \subseteq C_k(s,t)$ and $\mathbf{B}(s,\epsilon\rho^k)$ contains the shortest path from $s$ to $C'(t) = C_0(t)$, i.e., $t$ itself, is $C_i(s) \in \Pi_i^{(j)}$ in tree $T_j$ and $k = i$. ■

We now prove the general case when $i > \ell$.

Case $h-1 \ge i > \ell$. Let $x' \in A_{i-\ell}(t)$ be an arbitrary entry point to some level-$(i-\ell)$ cluster $C \ni t$ in some tree; hence $d(x',t) \le 2\eta_{i-\ell} \le \phi\epsilon\rho^{i}$ since $x', t \in C$. Applying the triangle inequality, we have $d(s,x') \le d(s,t) + d(t,x') \le \epsilon\rho^{i}$; thus all shortest paths from $s$ to $x'$, for all $x' \in A_{i-\ell}(t)$, are contained in $\mathbf{B}(s,\epsilon\rho^{i})$.
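
Spelled out, the triangle-inequality step combines the hypothesis of Lemma 5.1 on $d(s,t)$ with the diameter bound on the level-$(i-\ell)$ cluster stated just above; the two $\phi$ terms cancel:

```latex
\begin{align*}
d(s,x') &\le d(s,t) + d(t,x') \\
        &\le (1-\phi)\,\epsilon\rho^{i} + 2\eta_{i-\ell} \\
        &\le (1-\phi)\,\epsilon\rho^{i} + \phi\,\epsilon\rho^{i} \;=\; \epsilon\rho^{i}.
\end{align*}
```

This is why the lemma is stated with the factor $(1-\phi)$: the slack $\phi\epsilon\rho^{i}$ absorbs the diameter of the cluster around $t$.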

The properties of the $(\rho,\epsilon)$-PPHD ensure that there is at least one tree $T_q$ such that $\mathbf{B}(s,\epsilon\rho^{i})$ is not cut in the level-$i$ partition $\Pi_i^{(q)}$; let $C_i(s) \in \Pi_i^{(q)}$ be the level-$i$ cluster in $T_q$ such that $\mathbf{B}(s,\epsilon\rho^{i}) \subseteq C_i(s)$. Since $d(s,t) \le (1-\phi)\epsilon\rho^{i}$, we have $t \in \mathbf{B}(s,\epsilon\rho^{i}) \subseteq C_i(s) \in \Pi_i^{(q)}$. Let $C_{i-\ell}(t) \in \Pi_{i-\ell}^{(q)}$ be the level-$(i-\ell)$ cluster in $T_q$ containing $t$; we know that $C_{i-\ell}(t) \subseteq C_i(s)$, since $t \in \{C_{i-\ell}(t) \cap C_i(s)\}$ and $T_q$ represents a laminar decomposition. Hence we have $C_i(s) = C_i(t) = C_i(s,t)$ in the level-$i$ partition in tree $T_q$.

The $\mathrm{ValidPath}_s(C_{i-\ell}(t))$ bit must be set true in $\mathrm{Route}_s$ by the distributed Bellman-Ford algorithm in node $s$, since (a) $\mathbf{B}(s,\epsilon\rho^{i}) \subseteq C_i(s,t) \in \Pi_i^{(q)}$ and (b) $\mathrm{HF}(s,C_{i-\ell}(t)) \le \epsilon\rho^{i}$ in $\mathrm{Route}_s$ for entry $C_{i-\ell}(t) \in \Pi_{i-\ell}^{(q)}$ in tree $T_q$ at equilibrium, given that all shortest paths from $s$ to an entry point $x'$, for all $x' \in A_{i-\ell}(t) \cap C_{i-\ell}(t)$, are internal to $\mathbf{B}(s,\epsilon\rho^{i})$. Thus $\mathrm{PrefMatch}_s(t)$ can (and may indeed) just return $\mathrm{addr}(C_{i-\ell}(t))$ given no "better" choices, in which case $i' = i-\ell$ and $k \le i$.

However, $\mathrm{PrefMatch}_s(t)$ always finds a cluster $C'(t) \in \Pi_{i'}^{(j)}$ at the lowest level across all trees, such that $t \in C'(t)$ and $\mathrm{ValidPath}_s(C'(t))$ is true in $\mathrm{Route}_s$; hence $C'(t)$ is at level $i' \le i-\ell$.

We know that $\mathbf{B}(s,\epsilon\rho^{r}) \subseteq C_r(s,t) \in \Pi_r^{(j)}$ and $\mathrm{HF}(s,C'(t)) \le \epsilon\rho^{r}$ must both hold, for some $l_0 \le r \le i'+\ell$, in order for the $\mathrm{ValidPath}_s(C'(t))$ bit to be true, due to the specification of the distributed Bellman-Ford algorithm. Let $k$ be the lowest such $r$; we have $k \le i'+\ell \le i$ for $i > \ell$.

Case $i = h$. We have $k \le h$ and $i' \le h-\ell$ trivially, since both hold for all possible distances $d(s,t)$ up to $\Delta$, which is the diameter of the network $G$. ■
Claim 5.5. When $C'(t) \in \Pi_1^{(j)}$ is at level 1, $d(s,t) > \epsilon\rho^{\ell}$.

Proof. By contradiction. Assume that $d(s,t) \le \epsilon\rho^{\ell}$. By Claim 5.4, we have $C'(t) = C_0(t)$, contradicting the assumption that $C'(t)$ is at level 1. ■

Lemma 5.2. Let $C'(t) \in \Pi_{i'}^{(j)}$ be the cluster returned by function $\mathrm{PrefMatch}_s(t)$ at Step 3 in the Forwarding algorithm and let its level be $i'$, where $h-\ell \ge i' \ge 1$. Then $d(s,t) > (1-\phi)\epsilon\rho^{i'+\ell-1}$.

Proof. By contradiction. Assume that $d(s,t) \le (1-\phi)\epsilon\rho^{i'+\ell-1}$. By Lemma 5.1, the cluster $C'(t) \ni t$ that has the longest valid prefix matching with $t$ with the $\mathrm{ValidPath}_s(C'(t))$ bit set true in $\mathrm{Route}_s$ is at level at most $i'-1$, thus contradicting the assumption that $C'(t)$ is at level $i'$. ■

We next prove the following lemma regarding the level of $C'(t)$ given the level of $C_k(s,t)$.

Lemma 5.3. Let a level-$k$ cluster $C_k(s,t) \in \Pi_k^{(j)}$ in tree $T_j$, where $h-1 \ge k \ge \ell$, be the lowest-level common cluster of $s$ and $t$ such that a shortest path from $s$ to $C'(t) \in \Pi_{i'}^{(j)}$ is contained in $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t) \in \Pi_k^{(j)}$. Then $C'(t) \in \Pi_{i'}^{(j)}$ is at either level $k-\ell$ or level $k-\ell+1$.

Proof. By definition of $C_k(s,t)$, we know that $l_0 \le k \le i'+\ell$ and $l_0 \ge i'+1$, where $l_0$ is the level of $\mathrm{lca}^j(s,C'(t))$. Thus $k-\ell \le i' \le k-1$. The lowest level that $C'(t)$ can be at is $k-\ell$, and we argue that $C'(t)$ cannot be at a level higher than $k-\ell+1$.

Let $e_0$ be a closest entry point to $C'(t)$ for $s$, such that $e_0 \in A_{i'}(t) \cap C'(t)$ and the shortest path from $s$ to $e_0$ is internal to $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t) \in \Pi_k^{(j)}$; hence $d(s,e_0) \le \epsilon\rho^{k}$. Since $C'(t)$ is at least one level below $C_k(s,t)$ in $T_j$ and $e_0, t \in C'(t)$, we have $d(e_0,t) \le 2\eta_{k-1}$. Note that $C'(t) \subseteq C_k(s,t)$ by Fact 5.3. Applying the triangle inequality, we have $d(s,t) \le d(s,e_0) + d(e_0,t) \le \epsilon\rho^{k} + 2\eta_{k-1}$.

Now we examine the distance $d(s,y')$ for all $y' \in A_{k-\ell+1}(t)$. Given that $d(t,y') \le 2\eta_{k-\ell+1}$, we apply the triangle inequality and obtain:

$d(s,y') \le d(s,t) + d(t,y') \le d(s,e_0) + d(e_0,t) + d(t,y') \le \epsilon\rho^{k} + 2\eta_{k-1} + 2\eta_{k-\ell+1} \le \epsilon\rho^{k+1}$,
where  1>1  and  p  =  0(^). 

Thus all shortest paths from $s$ to $y'$, for all $y' \in A_{k-\ell+1}(t)$, are contained in $\mathbf{B}(s,\epsilon\rho^{k+1})$. The properties of the $(\rho,\epsilon)$-PPHD ensure that there is at least one tree $T_{j'}$ such that $\mathbf{B}(s,\epsilon\rho^{k+1}) \subseteq C_{k+1}(s) \in \Pi_{k+1}^{(j')}$. Let $C_{k-\ell+1}(t) \in \Pi_{k-\ell+1}^{(j')}$ be the level-$(k-\ell+1)$ cluster that contains $t$ in $T_{j'}$. Given that $t \in \mathbf{B}(s,\epsilon\rho^{k+1}) \subseteq C_{k+1}(s)$, we know that $C_{k-\ell+1}(t) \subseteq C_{k+1}(s)$, since $t \in \{C_{k-\ell+1}(t) \cap C_{k+1}(s)\} \neq \emptyset$ and $T_{j'}$ represents a laminar decomposition. Thus $C_{k-\ell+1}(t) \in \Pi_{k-\ell+1}^{(j')}$ must appear in the routing table of $s$ with $\mathrm{ValidPath}_s(C_{k-\ell+1}(t))$ set true, since $C_{k-\ell+1}(t) \subseteq C_{k+1}(s)$ is within $\ell$ levels below $C_{k+1}(s)$ in $T_{j'}$ and all shortest paths from $s$ to $C_{k-\ell+1}(t)$ are contained in $\mathbf{B}(s,\epsilon\rho^{k+1}) \subseteq C_{k+1}(s)$ in $T_{j'}$.

Thus the level $i'$ of $C'(t)$ must satisfy $k-\ell \le i' \le k-\ell+1$ for $C'(t) \in \Pi_{i'}^{(j)}$ to be returned by $\mathrm{PrefMatch}_s(t)$. ■

Fact 5.6. When $k = h$ and $C_k(s,t) \in \Pi_k^{(j)}$ is $C_h(s,t) \in \Pi_h^{(j)}$, which is the entire network $G$, we know that $C'(t) \in \Pi_{i'}^{(j)}$ is at level $i' = h-\ell = k-\ell$.

The next lemma describes the path from $s$ to $t$ up to the entry point $e_0$ of $C'(t) \in \Pi_{i'}^{(j)}$.
Lemma 5.4. All messages to be forwarded or sent from node $s$ to node $t$ will follow the same shortest path up to the closest entry point $e_0$ of $C'(t) \in \Pi_{i'}^{(j)}$ to $s$. The shortest path from $s$ to $e_0$ is internal to $C_k(s,t) \in \Pi_k^{(j)}$ in $T_j$; it has a length of $h^{i}_{s,e_0}$ that satisfies:

$h^{i}_{s,e_0} = \min_{e_i \in A_{i'}(t) \cap C'(t)} \{ h^{i}_{s,e_i} \}$,   (5.3)

where $i'$ is the level of $C'(t) \in \Pi_{i'}^{(j)}$ and $k \le i'+\ell$, and $C_k(s,t) \in \Pi_k^{(j)}$, $A_{i'}(t)$, and $h^{i}_{s,e_i}$ are as defined above, and $h^{i}_{s,e_i} = \infty$ when the shortest path from $s$ to $e_i$ is not contained in $C_k(s,t)$. At equilibrium, $h^{i}_{s,e_0} = \mathrm{HF}(s,C'(t)) = d(s,e_0)$. Finally, all vertices $v$ on the shortest path from $s$ to $e_0$ have a non-null $\mathrm{NextHop}_v(\mathrm{addr}(C'(t)))$ and share the same closest entry point $e_0$ to cluster $C'(t)$.

Proof. By the definition of $C_k(s,t)$, for $k \le h-1$, we know that $C_k(s,t) \in \Pi_k^{(j)}$ is the level-$k$ cluster, where $l_0 \le k \le i'+\ell$, in tree $T_j$, that is the lowest-level common cluster of $s$ and $t$ such that $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t) \in \Pi_k^{(j)}$ and $\mathbf{B}(s,\epsilon\rho^{k})$ contains a shortest path from $s$ to $C'(t) \in \Pi_{i'}^{(j)}$ in $T_j$; specifically, $\mathbf{B}(s,\epsilon\rho^{k})$ contains a shortest path from $s$ to $e_0$, and $d(s,e_0) \le \epsilon\rho^{k}$. By Fact 5.3, we have $C'(t) \subseteq C_k(s,t)$. When $k = h$, $C_k(s,t) = C_h(s,t) \in \Pi_h^{(j)}$ is the root cluster $X$ and naturally contains all shortest paths from $s$ to $C'(t) \subseteq C_h(s,t)$, given that $G$ is a connected graph.

All the nodes in $C_k(s,t) \in \Pi_k^{(j)}$ contain one entry for $C'(t) \in \Pi_{i'}^{(j)}$ in their routing tables, since $k \le i'+\ell$ and $C'(t) \subseteq C_k(s,t)$ is a cluster within $\ell$ levels below $C_k(s,t) \in \Pi_k^{(j)}$ in tree $T_j$. Propagation and subsequent updating of routing information among nodes of $C_k(s,t)$ in $T_j$ is equivalent to finding the minimum path internal to $C_k(s,t)$ from any node $u \in \{C_k(s,t) - C'(t)\}$ to an entry point of $C'(t)$ that is closest to node $u$; for $s$, the closest entry point to $C'(t)$ is $e_0$ such that $h^{i}_{s,e_0} = \min_{e_i \in A_{i'}(t) \cap C'(t)} \{ h^{i}_{s,e_i} \}$. Hence, at equilibrium, $e_0$ is on the minimal path from $s$ to $C'(t)$ and $h^{i}_{s,e_0} = \mathrm{HF}(s,C'(t))$ represents the length of such a minimal path.

All shortest paths of length $d(s,e_0)$ from $s$ to $e_0$ are internal to $C_k(s,t)$; when $k \le h-1$, they are within $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t) \in \Pi_k^{(j)}$. Hence, at equilibrium, within $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t)$ for $k \le h-1$, or within $C_k(s,t) = X$ for $k = h$, a shortest path of length $h^{i}_{s,e_0} = d(s,e_0)$ is formed between $s$ and $e_0$ among nodes within a connected component that share a common entry for $C'(t) \subseteq C_k(s,t)$ in their routing tables. Thus we have $\mathrm{HF}(s,C'(t)) = h^{i}_{s,e_0} = d(s,e_0)$.

For any node $v$ on one of these shortest paths from $s$ to $e_0$, $s$ and $v$ must share the same closest entry point $e_0$ to $C'(t)$ at equilibrium, due to the execution of the distributed Bellman-Ford algorithm; furthermore, intermediate nodes $v$ will be able to route the packet toward $C'(t) \in \Pi_{i'}^{(j)}$ in $C_k(s,t)$ consistently since they each contain an entry for $C'(t)$ with a non-null $\mathrm{NextHop}_v(\mathrm{addr}(C'(t)))$ field, given that these paths stay within $C_k(s,t) \in \Pi_k^{(j)}$, where $k \le i'+\ell$. The Forwarding algorithm will forward messages from node $s$ destined to node $t$ along the shortest path thus formed to first reach $C'(t)$ in tree $T_j$. ■

The process of finding the next entry point repeats once the packet reaches $e_0$, an entry point to $C'(t) \in \Pi_{i'}^{(j)}$ in tree $T_j$, until the packet reaches its destination $t$. For example, $e_0$ selects a new tree $T_l$ that contains the next cluster $C''(t) \in \Pi_{i''}^{(l)}$ with a longer prefix matching with $t$ than $C'(t)$, and updates the packet header with $C''(t)$ accordingly. Note that $C''(t)$ and $C'(t)$ may belong to two different trees; hence while intermediate nodes between one entry point and another never switch trees, upon reaching an entry point, the packet is free to switch. The next lemma states the upper bound on the level $i''$ of $C''(t) \in \Pi_{i''}^{(l)}$, and on the level of the common cluster $C_{k'}(x,t) \in \Pi_{k'}^{(l)}$ that contains a shortest path from $x$ to $C''(t)$ in $\mathbf{B}(x,\epsilon\rho^{k'}) \subseteq C_{k'}(x,t) \in \Pi_{k'}^{(l)}$. If $x$ is $t$ itself, we are done with forwarding.
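
The repeat-until-delivered structure described above can be sketched as a loop. Everything in the sketch is an assumption made for illustration: `pref_match(node, t)` is a hypothetical callable returning the address of the cluster a node would put in the header, `next_hop[node][addr]` stands for $\mathrm{NextHop}_{node}(addr)$, and `members[addr]` lists the nodes of a cluster so that arrival at an entry point can be detected.

```python
def forward(s, t, pref_match, next_hop, members, max_hops=10_000):
    """Follow the header cluster hop by hop; re-run PrefMatch on entering it."""
    node, path = s, [s]
    target = pref_match(node, t)          # address of C'(t), stored in the header
    while node != t:
        if node in members[target]:       # reached an entry point of the header cluster
            target = pref_match(node, t)  # choose C''(t), possibly in another tree
        node = next_hop[node][target]     # intermediate nodes keep the header as is
        path.append(node)
        if len(path) > max_hops:
            raise RuntimeError("tables do not appear to be at equilibrium")
    return path


# Toy line network s - a - b - t with clusters C1 = {b, t} and t0 = {t}.
members = {"C1": {"b", "t"}, "t0": {"t"}}
next_hop = {"s": {"C1": "a"}, "a": {"C1": "b"}, "b": {"t0": "t"}}
route = forward("s", "t",
                pref_match=lambda node, t: "t0" if node == "b" else "C1",
                next_hop=next_hop, members=members)
print(route)
```

In the toy run the header changes exactly once, at `b`, which is the entry point of `C1`; nodes `s` and `a` forward on the unchanged header.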

Lemma 5.5. Let $1 \le i' \le h-\ell$ be the level of $C'(t) \in \Pi_{i'}^{(j)}$. Once the packet from $s$ reaches an entry point $x$ in $A_{i'}(t) \cap C'(t)$, including $e_0$, $x$ will find a new level-$i''$ cluster $C''(t) \in \Pi_{i''}^{(l)}$ at level $i'' \le \max(0, i'-2)$ in some tree $T_l$, and the common cluster $C_{k'}(x,t) \in \Pi_{k'}^{(l)}$ as defined above is at a level $k' \le i'-2+\ell$.

Proof. We have $d(x,t) \le 2\eta_{i'}$ since $x \in A_{i'}(t) \cap C'(t)$ is an entry point to some level-$i'$ cluster $C'(t) \in \Pi_{i'}^{(j)}$ containing $t$. We have $d(x,t) \le (1-\phi)\epsilon\rho^{i'+\ell-2}$ when suitable constants are chosen for $\ell = \Theta(\log_\rho 1/\epsilon\tau)$ and $\rho$. Lemma 5.1, applied with $i = i'+\ell-2$, tells us that $k' \le i'-2+\ell$ and that $C''(t) \in \Pi_{i''}^{(l)}$ is at level $i'' \le \max(0, i'-2)$. ■

We are now ready for the main theorem that summarizes the path properties.

Theorem 5.1. Following the Forwarding algorithm in Section 4.3, for all $k \le h$, the path from $s$ to $t$ as derived from the routing information at node $s$ satisfies the recursive equation $h^{c}_{s,t} = h^{i}_{s,e_0} + h^{c}_{e_0,t}$, where the shortest path $h^{i}_{s,e_0}$ from $s$ to $e_0$ is contained in $C_k(s,t)$ and its properties are as specified in Lemma 5.4. Secondly, the lookup path has a stretch of at most $(1+\tau)$. Finally, the algorithm switches trees at most $\max(0, k-\ell+1)$ times. When $d(s,t) \le (1-\phi)\epsilon\rho^{n}$, where $n \le h$, we have $k \le n$; otherwise, $k \le h$.

Proof. The proof of the theorem is by induction on $k$, which is the level of the lowest common cluster $C_k(s,t)$ of $s$ and $C'(t)$ such that a shortest path from $s$ to $C'(t)$ is contained in (a) $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t)$ for $k \le h-1$, or in (b) $C_k(s,t) = C_h(s,t)$ for $k = h$. Recall that $i'$ is the level of $C'(t) \in \Pi_{i'}^{(j)}$ and $e_0 \in A_{i'}(t) \cap C'(t)$ is the closest entry point to $C'(t)$ for node $s$ within $C_k(s,t) \in \Pi_k^{(j)}$.

Base Case: $k \le \ell-1$.

We first prove the following claim.

Claim 5.7. If $C_k(s,t)$ is at level $k \le \ell-1$, then $C'(t) = C_0(t)$ and $d(s,t) \le \epsilon\rho^{\ell-1}$.

Proof. By the definition of $C_k(s,t)$, we know $\mathbf{B}(s,\epsilon\rho^{k}) \subseteq C_k(s,t) \in \Pi_k^{(j)}$ in $T_j$ and $d(s,e_0) \le \epsilon\rho^{k}$, where $e_0 \in C'(t)$ is the closest entry point to $C'(t)$ for node $s$. Thus $d(s,e_0) \le \epsilon\rho^{\ell-1}$ for $k \le \ell-1$. Since $C'(t) \in \Pi_{i'}^{(j)}$ is a descendant of $C_k(s,t) \in \Pi_k^{(j)}$ in $T_j$, it must be at a level lower than $k$; hence $d(e_0,t) \le 2\eta_{\ell-2} \le 4\rho^{\ell-2}$, since $e_0, t \in C'(t)$, and $C'(t)$ is at level $i' \le \ell-2$.

Applying the triangle inequality, we have $d(s,t) \le d(s,e_0) + d(e_0,t) \le \epsilon\rho^{\ell-1} + 4\rho^{\ell-2} \le \epsilon\rho^{\ell}$. Thus by Claim 5.4, we have $C'(t) = C_0(t)$, which is $t$ itself; furthermore, $e_0 = t$ and $d(s,t) = d(s,e_0) \le \epsilon\rho^{\ell-1}$. ■

The above claim shows that $C_k(s,t)$ in tree $T_j$ contains a shortest path from $s$ to $C'(t) = C_0(t)$,
and $t$ is the closest entry point to $C_0(t)$, which is $t$ itself. Thus $h_{e_0 t} = h_{tt} = 0$, since a node's distance
to itself is zero. It remains to show that $h^k_{st} = h^k_{s e_0}$ equals the length of the shortest path from $s$ to $t$
contained in $C_k(s,t)$. This is true since the routing table of every node $v$ in $C_k(s,t)$, for $k \le \ell-1$,
contains an entry for $C_0(t) = t$, and a shortest path from $s$ to $t$ is contained in $B(s,\varepsilon\rho^{k}) \subseteq C_k(s,t)$ in tree $T_j$;
hence at equilibrium, the clustered path between $s$ and $t$ as derived from Route is the shortest path from
$s$ to $t$, and it is internal to $C_k(s,t)$, i.e., $h^k_{st} = \mathrm{HF}(s, C_0(t)) = d(s,t)$, where $k \le \ell-1$. The stretch is
exactly 1 since $h^k_{st}/d(s,t) = 1$. The forwarding algorithm does not switch trees at all.

Case $k = \ell$. By Lemma 5.3, $C'(t)$ is at level 0 or 1. When $C'(t)$ is at level $k-\ell = 0$, the proof is the same as
that in the base case.

When $C'(t)$ is at level $k-\ell+1 = 1$, we have $d(s,t) > \varepsilon\rho^{\ell}$ by Claim 5.5. All messages to be forwarded
or sent from node $s$ to node $t$ will first follow the same shortest path of length $h^k_{s e_0} = d(s,e_0)$, which is internal
to $C_k(s,t)$, up to the closest entry point $e_0$ of $C'(t)$, as specified in Lemma 5.4.

Upon reaching $e_0$, Lemma 5.5 can be applied to show that $h^{k'}_{e_0 t}$, the clustered path from $e_0$ to $t$, is entirely
contained in a level-$k'$ cluster $C_{k'}(e_0,t)$ in some tree $T_{j'}$, where $k' \le \ell'-2+\ell = \ell-1$; thus as proved in
the base case, $h^{k'}_{e_0 t} = d(e_0,t)$. The clustered path from $s$ to $t$ as derived from Route indeed satisfies
$h^k_{st} = h^k_{s e_0} + h^{k'}_{e_0 t}$, where $h^k_{s e_0} = d(s,e_0)$ and $h^{k'}_{e_0 t} = d(e_0,t)$.
Hence we obtain the bound on the entire path: $h^k_{st} = d(s,e_0) + d(e_0,t) \le d(s,t) + 2d(e_0,t)$, where
$d(s,e_0) \le d(s,t) + d(e_0,t)$ by the triangle inequality. The path stretch is therefore
$\frac{h^k_{st}}{d(s,t)} \le 1 + \frac{2d(e_0,t)}{d(s,t)} \le 1+\tau$, where $\ell \ge 1 + \log_\rho\frac{4}{\varepsilon\tau}$. The algorithm switches trees at most once.

Case $k \ge \ell+1$. First we assume that the theorem is true up to $k-1$; let us show that it is true for $k$.

Let $C_k(s,t)$ be the $k$-th level cluster that contains a shortest path from $s$ to $C'(t)$ in $T_j$.
According to Lemma 5.4, all messages to be forwarded or sent from node $s$ to node $t$ will first follow the
same shortest path of length $h^k_{s e_0}$ that is internal to $C_k(s,t)$, up to the closest entry point $e_0$ of $C'(t)$. By
Lemma 5.3, we know that $C'(t)$ is at level $k-\ell$ or $k-\ell+1$ when $k \le h-1$. When $k = h$, $C'(t)$ is at level
$h-\ell = k-\ell$.

Upon reaching $e_0$, Step 3 of the forwarding algorithm is applied to find $C''(t)$, which has the longest valid
matching with $t$ in $e_0$'s routing table. Since $C'(t)$ is at level $k-\ell$ or $k-\ell+1$, Lemma 5.5 shows that the lowest
common cluster $C_{k'}(e_0,t)$ of $e_0$ and $t$, such that $B(e_0,\varepsilon\rho^{k'}) \subseteq C_{k'}(e_0,t)$ and $B(e_0,\varepsilon\rho^{k'})$ contains a shortest
path from $e_0$ to $C''(t)$, is at level $k' \le \ell'-2+\ell \le k-1$. Thus $h^{k'}_{e_0 t}$ is known from the induction hypothesis
and $h^k_{st} = h^k_{s e_0} + h^{k'}_{e_0 t}$.

Now we proceed to prove the bound on the stretch for level $k$. Let $C'(t)$ be at level $P-\ell$, where $P$ is
$k$ or $k+1$; hence $d(e_0,t) \le 2\eta_{P-\ell}$ given that $e_0, t \in C'(t)$. By Lemma 5.2, $d(s,t) \ge (1-\phi)\varepsilon\rho^{P-1}$, where
$\ell' = P-\ell \ge 1$ for all $k \ge \ell+1$.

By Lemma 5.4, $h^k_{s e_0}$ is the shortest path from $s$ to $e_0$ that is internal to $C_k(s,t)$ and $h^k_{s e_0} = d(s,e_0)$;
applying the triangle inequality, we obtain:

$$h^k_{s e_0} = d(s,e_0) \le d(s,t) + d(e_0,t) \le d(s,t) + 2\eta_{P-\ell}. \tag{5.4}$$


By the induction hypothesis, we know

$$h^{k'}_{e_0 t} \le (1+\tau)\,d(e_0,t) \le 2(1+\tau)\,\eta_{P-\ell}. \tag{5.5}$$

Finally, we get the bound on the total path length from $s$ to $t$:

$$h^k_{st} = h^k_{s e_0} + h^{k'}_{e_0 t} \le d(s,t) + 2(2+\tau)\,\eta_{P-\ell}. \tag{5.6}$$
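As a quick arithmetic check of how (5.4) and (5.5) combine into (5.6), the following minimal Python sketch adds the two legs of the path. The function name `path_length_bound` and the sample values chosen for d(s,t), eta_{P-l} and tau are illustrative assumptions, not values taken from the analysis.

```python
def path_length_bound(d_st: float, eta: float, tau: float) -> float:
    """Upper bound on the clustered path length from s to t.

    d_st -- the distance d(s, t)
    eta  -- the cluster parameter eta_{P-l} (assumed given)
    tau  -- the stretch parameter, 0 < tau < 1
    """
    first_leg = d_st + 2 * eta          # bound (5.4) on the path s -> e_0
    second_leg = 2 * (1 + tau) * eta    # bound (5.5) on the path e_0 -> t
    return first_leg + second_leg       # bound (5.6) on the whole path


# Hypothetical sample values, chosen only for illustration.
d_st, eta, tau = 100.0, 3.0, 0.5
bound = path_length_bound(d_st, eta, tau)

# The sum of the two legs matches the closed form d(s,t) + 2(2 + tau) * eta.
assert abs(bound - (d_st + 2 * (2 + tau) * eta)) < 1e-9
print(bound, bound / d_st)
```

Dividing the bound by d(s,t) gives the quantity that the stretch calculation then compares against 1 + tau.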


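Now using the fact that $d(s,t) \ge (1-\phi)\varepsilon\rho^{P-1}$ and the fact that $\tau < 1$, we obtain the path stretch from
$s$ to $t$:

$$\frac{h^k_{st}}{d(s,t)} \le 1 + \frac{2(2+\tau)\,\eta_{P-\ell}}{d(s,t)} \le 1 + \frac{6\,\eta_{P-\ell}}{d(s,t)} \le 1 + \frac{6\,\eta_{P-\ell}}{(1-\phi)\varepsilon\rho^{P-1}} \le 1+\tau, \tag{5.7}$$

where $\ell \ge \log_\rho\frac{8}{\varepsilon\tau} + 2$.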
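To get a feel for the threshold on the level parameter in (5.7), the sketch below finds the smallest integer l with rho^(l-2) >= 8/(eps*tau), which for rho > 1 is the same condition as l >= log_rho(8/(eps*tau)) + 2. The helper name `min_level`, the restriction to nonnegative integer levels, and the sample parameters are assumptions made for illustration only.

```python
def min_level(eps: float, tau: float, rho: float) -> int:
    """Smallest nonnegative integer l with rho**(l - 2) >= 8 / (eps * tau).

    Assumes eps > 0, 0 < tau < 1 and rho > 1, so the loop terminates.
    """
    if eps <= 0 or not (0 < tau < 1) or rho <= 1:
        raise ValueError("expected eps > 0, 0 < tau < 1, rho > 1")
    target = 8.0 / (eps * tau)
    level = 0
    # Increase the level until the threshold is met; this avoids
    # rounding problems from taking a floating-point logarithm.
    while rho ** (level - 2) < target:
        level += 1
    return level


# Hypothetical parameters: eps = 1, tau = 0.5, rho = 2.
print(min_level(1.0, 0.5, 2.0))
```

A smaller tau (a tighter stretch requirement) pushes the required level up, which is the trade-off the condition expresses.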

Finally, the algorithm switches trees at most $k-\ell$ times to finally route within a level-$\ell$ cluster, after
which it switches trees at most once, thus adding up to a total of $k-\ell+1$ times.
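The tree-switch count from the three cases of the proof can be summarized in a one-line helper. This is only a sketch of the bound max(0, k - l + 1) stated in the theorem; the function name `max_tree_switches` and the sample levels are assumptions for illustration.

```python
def max_tree_switches(k: int, ell: int) -> int:
    """Upper bound on the number of tree switches for a level-k common cluster.

    k   -- level of the lowest common cluster C_k(s, t)
    ell -- the level parameter l used in the analysis
    """
    return max(0, k - ell + 1)


# Hypothetical levels with ell = 5:
#   k = 3 falls in the base case (k <= ell - 1),
#   k = 5 is the case k = ell,
#   k = 8 is the inductive case (k >= ell + 1).
for k in (3, 5, 8):
    print(k, max_tree_switches(k, ell=5))
```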

Now we look at the bound on $k$ itself. When $d(s,t) < (1-\phi)\varepsilon\rho^{n}$, for all $n < h$, we have $k \le n$
by Lemma 5.1. We now verify that all statements in the theorem still apply for the clustered path $h^k_{st}$
when $d(s,t) \ge (1-\phi)\varepsilon\rho^{h-1}$ and $C_k(s,t)$ is $C_h(s,t)$. First of all, when $C_k(s,t) = C_h(s,t)$, following the
Forwarding algorithm in Section 4.3, we know that $C'(t)$ is at level $h-\ell$ and hence $d(s,t) \ge (1-\phi)\varepsilon\rho^{h-1}$
by Lemma 5.2. The shortest path from $s$ to $e_0 \in C'(t)$ is internal to $C_h(s,t)$, the entire network $G$. Upon
reaching $e_0$, $h^{k'}_{e_0 t}$ is known by applying the theorem directly since $d(e_0,t) \le 2\eta_{h-\ell} < (1-\phi)\varepsilon\rho^{h-1}$. Thus we
have $h^{k'}_{e_0 t} \le (1+\tau)\,d(e_0,t) \le (1+\tau)\,2\eta_{h-\ell}$. Hence, the entire path satisfies the equation $h^k_{st} = h^k_{s e_0} + h^{k'}_{e_0 t}$.
Second, with the same calculation as in the proof above, it is easy to verify that the entire path $h^k_{st}$ has a stretch
of at most $(1+\tau)$ given that $d(s,t) \ge (1-\phi)\varepsilon\rho^{h-1}$ and $h^{k'}_{e_0 t} \le (1+\tau)\,d(e_0,t) \le (1+\tau)\,2\eta_{h-\ell}$. The algorithm
switches trees at most $(h-\ell+1)$ times. ■

Corollary 5.1. For all $t$ such that $d(s,t) \le \varepsilon\rho^{\ell}$, path stretch is 1.

Proof. Node $s$ has a routing table entry for all $t$ such that $d(s,t) \le \varepsilon\rho^{\ell}$, since $B(s, d(s,t)) \subseteq B(s, \varepsilon\rho^{\ell})$ is fully
contained in some level-$\ell$ cluster $C_\ell(s)$ in some tree $T_j$, and $C'(t)$ is $C_0(t)$; the base case of the
above proof shows that path stretch is 1. ■